1. #1
    Baeoz
    Join Date: 06-19-14
    Posts: 46
    Betpoints: 271

    Regression question on tennis

    Hi,
    I am looking to improve my tennis model. It's pretty accurate, but still could be improved.



    I have a lot of match data, but it only contains games, sets and wins. I haven't found a way to get serve data for low-level tennis tournaments, and it doesn't seem workable to use ATP & WTA serve data for some players but nothing for others. If there's a way to compensate for that, please let me know.


    Anyway, I am reading 'Analytic Methods in Sports' and was looking at linear regression. The book is detailed on regression analysis, but not on how to incorporate it into predicting a winner between two players. I've found quite strong correlations between winning matches and winning games and sets, as you would expect. There are also strong, though weaker, correlations between just playing games and sets (win or lose) and winning matches. I guess the spuds only play a few games, then give up, while the better players play at a level where they can win, and play often?

    Does this mean that the regression information is useless? I have the regression equations for both men and women for sets won predicting matches won. I'm not sure it helps, however. I haven't found a way to integrate it into my existing model, which uses scores and takes opponents into account.

    Any tips on how you would go about using regression information in a model would be handy.

    Thanks.
    B.

  2. #2
    antonyp22
    Join Date: 01-12-14
    Posts: 78
    Betpoints: 2528

    If we are talking about two outcomes, i.e. winning or losing, then the way to go would be logistic regression.
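
A minimal sketch of what fitting a logistic regression involves, written in Python rather than Excel or R. The data, the choice of predictor (share of games won in a match) and the names `fit_logistic`, `xs` and `ys` are all hypothetical and made up for illustration; in practice a statistics package would normally do the fitting.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=20000):
    """Fit p(win) = sigmoid(b0 + b1 * x) by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # observed outcome minus predicted probability
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Hypothetical data: x = share of games won in a match, y = 1 if the match was won.
xs = [0.30, 0.35, 0.42, 0.45, 0.48, 0.52, 0.55, 0.58, 0.65, 0.70]
ys = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
p_low = sigmoid(b0 + b1 * 0.30)   # fitted win probability at a 30% games share
p_high = sigmoid(b0 + b1 * 0.70)  # fitted win probability at a 70% games share
```

The fitted constant `b0` is what stops the curve from being pinned at 0.5 when the predictor is 0, so it is worth letting the fit estimate it rather than fixing it.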

  3. #3
    Baeoz
    Join Date: 06-19-14
    Posts: 46
    Betpoints: 271

    Thanks for the reply.

    I found a YouTube video explaining logistic regression in Excel. I followed it and think I did it correctly, but I suspect that the person in the video used a very specific type of logistic regression or made an error. The reason is that the equation I got from this regression gives a probability of 0.5 for a player who has won no games. That happens because the regression equation as presented, p(x) = e^L/(1+e^L), returns 0.5 when L is 0, which is the case when the constant is 0 (Solver returned such a constant) and the predictor is 0.
    If I'm doing something wrong, let me know. I'm happy to post some of the data.
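
The behaviour described above follows directly from the formula p(x) = e^L/(1+e^L) with L = constant + coefficient * predictor. A short sketch, using made-up coefficients (the values 0.3 and -3.0 are assumptions for illustration, not fitted results from the thread):

```python
import math

def logistic(L):
    """p = e^L / (1 + e^L), written in the equivalent form 1 / (1 + e^-L)."""
    return 1.0 / (1.0 + math.exp(-L))

def win_prob(x, b0, b1):
    """Probability from a one-predictor logit: L = b0 + b1 * x."""
    return logistic(b0 + b1 * x)

# With a constant of 0, a predictor of 0 gives L = 0 and therefore p = 0.5.
p_zero_constant = win_prob(0, b0=0.0, b1=0.3)

# With a negative constant (hypothetical value), the same predictor of 0
# gives a probability well below 0.5.
p_negative_constant = win_prob(0, b0=-3.0, b1=0.3)
```

So a result of exactly 0.5 at zero games won points at the constant being 0 (or being forced to 0), rather than at the formula itself.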
    Thanks again.

    By the way, Marich's mullet is a thing of beauty.

  4. #4
    antonyp22
    Join Date: 01-12-14
    Posts: 78
    Betpoints: 2528

    First things first, the mullet is legendary.

    I watched the video and didn't find it that helpful; it seems like a very specific example. I don't normally perform logistic regression in Excel. I usually do it in R, and when I do perform it in Excel I use third-party add-ins like Sigma XL, etc.

    If you posted some of the data it might help to see where it has gone wrong.

    Check out the following link and see if it helps: http://www.real-statistics.com/logistic-regression/

  5. #5
    jtoler
    Join Date: 12-17-13
    Posts: 30,967
    Betpoints: 6337

    Why would you want to go about betting tennis of all sports like this?

  6. #6
    Baeoz
    Join Date: 06-19-14
    Posts: 46
    Betpoints: 271

    I'm not sure what you mean. I have a model that I'm doing OK with, and I'd like to tweak it, so I thought maybe regression was the solution. If not, I'll try something else. Just asking for tips.

  7. #7
    jtoler
    Join Date: 12-17-13
    Posts: 30,967
    Betpoints: 6337

    Quote: Originally Posted by Baeoz
    I'm not sure what you mean. I have a model that I'm doing OK with, and I'd like to tweak it, so I thought maybe regression was the solution. If not, I'll try something else. Just asking for tips.
    Was just asking why tennis, because player ability isn't even a factor sometimes. I mean, how do you gauge that a player really doesn't want to win a match, or really can't? You can kind of go by the points they are defending. Also the form issue: how do you gauge that unless you've actually been watching them play? Trying to gauge form by who they recently beat just starts a chain reaction, since you'd need to know the form of the player they beat, which is another reason for needing to actually watch the matches. I'm just rambling, not sure if anything I've said has anything to do with what you're talking about. I only seldom come to this sub forum, and you guys are really into what you're doing with the models and all. I should probably look more into it.

  8. #8
    biddy_24
    Join Date: 10-09-11
    Posts: 136

    Tennis

    The most important factor in tennis is head to head record
    Last edited by biddy_24; 09-27-14 at 08:34 AM.

  9. #9
    jtoler
    Join Date: 12-17-13
    Posts: 30,967
    Betpoints: 6337

    Quote: Originally Posted by biddy_24
    The most important factor in tennis is head to head record
    Hope you're joking.

  10. #10
    Baeoz
    Join Date: 06-19-14
    Posts: 46
    Betpoints: 271

    Quote: Originally Posted by jtoler
    Was just asking why tennis, because player ability isn't even a factor sometimes. I mean, how do you gauge that a player really doesn't want to win a match, or really can't? You can kind of go by the points they are defending. Also the form issue: how do you gauge that unless you've actually been watching them play? Trying to gauge form by who they recently beat just starts a chain reaction, since you'd need to know the form of the player they beat, which is another reason for needing to actually watch the matches. I'm just rambling, not sure if anything I've said has anything to do with what you're talking about. I only seldom come to this sub forum, and you guys are really into what you're doing with the models and all. I should probably look more into it.
    All valid concerns. I don't need to predict perfectly, just out-predict the bookies. This is probably why I do OK on lower-level tournaments and struggle on top-level ones. The bookies and the pro bettors would be watching all those important matches. I don't know if I can beat them consistently; doubtful. But I think I can at low level. I don't think they watch Futures and ITF 10,000 tournaments in Outer Mongolia so closely. I hope that's the case, anyway.

    Unfortunately, not all low-level tournaments are covered by bookies, and one bookie changed the bet type from single to 3-way combo 2 days after I joined, just when I was doing nicely on low-level tournaments. For those reasons, I'm hoping to improve my prediction rate at mid-level tournaments.
    Last edited by Baeoz; 09-27-14 at 06:38 PM.

  11. #11
    Baeoz
    Join Date: 06-19-14
    Posts: 46
    Betpoints: 271

    And for the matches where players have not met?

  12. #12
    Baeoz
    Join Date: 06-19-14
    Posts: 46
    Betpoints: 271

    Just a quick example of what I'm trying.
    In about an hour in Australia, K. Pearson plays S. Carson. Pearson is the favourite at $1.31 and Carson is at $3.26. My model says Carson will win, so I'm backing him. I have no idea who these guys are and I'm hoping the bookies know even less.
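
For reference, the decimal odds in this example can be turned into implied probabilities, which is the bar a model has to clear. This is a sketch under assumptions: the proportional split used to strip out the bookmaker's margin is one common convention among several, and the model probability of 0.55 is a made-up number, not a figure from the thread.

```python
def implied_probs(odds_a, odds_b):
    """Convert two decimal odds into margin-free probabilities plus the margin."""
    raw_a = 1.0 / odds_a
    raw_b = 1.0 / odds_b
    total = raw_a + raw_b          # exceeds 1 by the bookmaker's margin
    return raw_a / total, raw_b / total, total - 1.0

def expected_value(model_prob, decimal_odds):
    """Expected profit per $1 staked if the model probability is right."""
    return model_prob * decimal_odds - 1.0

# Odds from the post: Pearson $1.31, Carson $3.26.
p_pearson, p_carson, margin = implied_probs(1.31, 3.26)

# Break-even probability for a bet on Carson at $3.26.
break_even_carson = 1.0 / 3.26

# Hypothetical model probability for Carson (an assumption for illustration).
ev_carson = expected_value(0.55, 3.26)
```

A bet on Carson only has positive expected value if the model's probability for him is above the break-even figure, which is a much lower bar than "Carson will win".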
    Last edited by Baeoz; 09-27-14 at 06:51 PM.

  13. #13
    jtoler
    Join Date: 12-17-13
    Posts: 30,967
    Betpoints: 6337

    Quote: Originally Posted by Baeoz
    And for the matches where players have not met?
    Just have to know the players and the form they are in, pts defending, strengths, weaknesses. Same as with previous h2h really.

  14. #14
    antonyp22
    Join Date: 01-12-14
    Posts: 78
    Betpoints: 2528

    Baeoz, have you made any progress with the logit model?

  15. #15
    Baeoz
    Join Date: 06-19-14
    Posts: 46
    Betpoints: 271

    Quote: Originally Posted by antonyp22
    Baeoz, have you made any progress with the logit model?

    Not yet. I was a bit busy drinking and watching the Swannies make out like they're a bunch of spuds yesterday. I'll look at it soon and let you know how I went.
    Last edited by Baeoz; 09-27-14 at 09:04 PM. Reason: Forgot to quote post I'm replying to.

  16. #16
    Baeoz
    Join Date: 06-19-14
    Posts: 46
    Betpoints: 271

    Quote: Originally Posted by jtoler
    Just have to know the players and the form they are in, pts defending, strengths, weaknesses. Same as with previous h2h really.
    Sounds about right, but I guess my point was that h2h doesn't help you out when they've not met, or haven't met in ages.

  17. #17
    Baeoz
    Join Date: 06-19-14
    Posts: 46
    Betpoints: 271

    Quote: Originally Posted by antonyp22
    First things first, the mullet is legendary.

    I watched the video and didn't find it that helpful; it seems like a very specific example. I don't normally perform logistic regression in Excel. I usually do it in R, and when I do perform it in Excel I use third-party add-ins like Sigma XL, etc.

    If you posted some of the data it might help to see where it has gone wrong.

    Check out the following link and see if it helps: http://www.real-statistics.com/logistic-regression/
    I found a reason why I was having issues with the first attempt at logistic regression. I think the regression was fine, but I was using the resulting coefficients wrongly. I'm still working through the Real Statistics example, but I've already found one mistake I was making, plus some data inconsistency, thanks to the way it suggests tabulating the data. Worth the effort for that alone.

  18. #18
    Baeoz
    Join Date: 06-19-14
    Posts: 46
    Betpoints: 271

    Well, I've tried the logit, but I think I'm missing something. For example, if I run a logit of match wins on number of games won, I get results that sort of make sense (most wins in my dataset are 6-0 6-0), but how I feed that back into my model is not clear. A player might lose 0-6 0-6 one day because they're playing a gun, and win 6-0 6-0 another day because they're playing a relative spud...
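
One possible way to make a logit usable for a specific match-up is to use the difference between the two players' numbers as the predictor, rather than one player's raw score. The sketch below assumes a hypothetical predictor (each player's historical share of games won) and a made-up slope of 8.0; neither comes from the thread, and the slope would have to be fitted on real match data.

```python
import math

def match_win_prob(share_a, share_b, slope):
    """P(A beats B) from the difference in the players' games-won shares.

    There is deliberately no constant term: two identical players get 0.5,
    and swapping the players gives the complementary probability.
    """
    z = slope * (share_a - share_b)
    return 1.0 / (1.0 + math.exp(-z))

SLOPE = 8.0  # hypothetical value for illustration only

p_a_beats_b = match_win_prob(0.56, 0.49, SLOPE)
p_b_beats_a = match_win_prob(0.49, 0.56, SLOPE)
p_even = match_win_prob(0.50, 0.50, SLOPE)
```

In this setup a probability of 0.5 at a predictor of 0 is the desired behaviour, since a difference of 0 means the two players look identical to the model. It does not by itself correct for who each player earned those numbers against.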

  19. #19
    antonyp22
    Join Date: 01-12-14
    Posts: 78
    Betpoints: 2528

    From what you've said above, I assume you're saying that results are skewed because the same player can look very different depending on the opponent? If so, you may need to come up with a "strength of schedule" coefficient of some sort, based on who a player has played in the past.
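
The post does not say how such a coefficient would be built. One standard technique that accounts for opponent quality is the Bradley-Terry model, sketched here as a swapped-in illustration rather than as the method meant above. The player labels, the match list and the function names are hypothetical; the simple iteration used also assumes every player has at least one win and one loss.

```python
from collections import defaultdict

def bradley_terry(matches, iters=200):
    """Estimate player strengths from (winner, loser) pairs.

    Uses the classic iterative update: strength = wins / sum over opponents of
    games_played / (own strength + opponent strength). Beating strong
    opponents therefore counts for more than beating weak ones.
    """
    players = {p for match in matches for p in match}
    wins = defaultdict(int)
    games = defaultdict(int)  # games played per unordered pair of players
    for winner, loser in matches:
        wins[winner] += 1
        games[frozenset((winner, loser))] += 1

    strength = {p: 1.0 for p in players}
    for _ in range(iters):
        updated = {}
        for p in players:
            denom = 0.0
            for pair, n in games.items():
                if p in pair:
                    (q,) = pair - {p}  # the opponent in this pairing
                    denom += n / (strength[p] + strength[q])
            updated[p] = wins[p] / denom
        total = sum(updated.values())
        # Rescale so the average strength stays at 1.
        strength = {p: v * len(players) / total for p, v in updated.items()}
    return strength

def bt_win_prob(strength, a, b):
    """P(a beats b) under the Bradley-Terry model."""
    return strength[a] / (strength[a] + strength[b])

# Hypothetical results as (winner, loser) pairs.
MATCHES = [
    ("A", "B"), ("A", "B"), ("B", "A"),
    ("A", "C"), ("A", "C"),
    ("B", "C"), ("B", "C"), ("C", "B"),
    ("C", "D"), ("C", "D"), ("D", "C"),
    ("B", "D"),
]

strengths = bradley_terry(MATCHES)
p_a_beats_d = bt_win_prob(strengths, "A", "D")
```

Note that A and D never meet in this made-up data, yet the model still produces a probability for that match-up through their common opponents.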
