1. #1
    Bsims
    Join Date: 02-03-09
    Posts: 827
    Betpoints: 13

    A way of evaluating a predictive model’s reasonableness

    Many of us use programs, spreadsheets, data, etc. to build models that predict scores for contests we are interested in wagering on. That raises the question: how good are the predictions? The obvious answer is whether you make money with the model. But it can take a lot of bets to get comfortable with the results, and it can be expensive if the model isn’t very accurate.

    The next obvious approach is to backtest with previous data. This is free, but a lot of work. You must get the previous data into a usable form, then write your model to iterate through the previous games, generating the bets the model likes. This is the approach I mainly rely on. One problem with this is that I only keep the profitable bets, which tells me nothing about the other score predictions.

    Recently I’ve started using correlation analysis. I generate two correlations, one per team, each between the predicted score and the actual score. This is a particularly useful tool for comparing different predictive models. I’m currently doing CBB analysis and looking at 4 predictive models: LV (implied scores from several books using the average spread and totals lines), Like games (my like game system described on my blog and in another topic here), KenPom predictions, and predictions from a power rating system I developed years ago.
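
    A minimal sketch of the per-team correlation check described above. The game data and the tuple layout are made up for illustration; only the idea (correlate predicted with actual score, separately for team 1 and team 2) comes from the post.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical games: (pred_visitor, actual_visitor, pred_home, actual_home)
games = [
    (68.0, 71, 74.5, 70),
    (72.5, 66, 70.0, 75),
    (61.0, 58, 79.0, 84),
    (75.0, 80, 77.5, 73),
    (66.5, 69, 71.0, 77),
]

# One correlation per side, as in the post
corr_visitor = pearson([g[0] for g in games], [g[1] for g in games])
corr_home = pearson([g[2] for g in games], [g[3] for g in games])
print(f"visitor corr: {corr_visitor:.3f}, home corr: {corr_home:.3f}")
```

    Running the same function over each model's predictions gives numbers that can be compared side by side, provided the models are scored on the same games.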

    The CBB system I’m developing will rely on 3 of the 4 models (I won’t be using the LV implied scores, since they are derived from the lines themselves). The problem is how to weight the other three. Hopefully the correlation study will provide some guidance. I’ll have the correlation results shortly.
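
    One possible way to turn accuracy numbers into weights is inverse-MSE weighting, a common heuristic. This is not the method the post settles on (it does not name one); the model names, predictions, and scores below are hypothetical, and the sketch assumes no model has an MSE of exactly zero.

```python
def mse(preds, actuals):
    """Mean squared error of predictions against actual scores."""
    return sum((p - a) ** 2 for p, a in zip(preds, actuals)) / len(actuals)

def inverse_mse_weights(model_preds, actuals):
    """Weight each model by 1/MSE, normalised so the weights sum to 1."""
    inv = {name: 1.0 / mse(preds, actuals) for name, preds in model_preds.items()}
    total = sum(inv.values())
    return {name: value / total for name, value in inv.items()}

def blend(model_preds, weights):
    """Weighted average of the models' predictions, game by game."""
    n_games = len(next(iter(model_preds.values())))
    return [
        sum(weights[name] * preds[i] for name, preds in model_preds.items())
        for i in range(n_games)
    ]

# Hypothetical predictions for one team's score in four games
actual_scores = [70, 65, 80, 74]
model_preds = {
    "like_games": [72, 66, 77, 75],
    "kenpom": [69, 67, 79, 72],
    "power_rating": [74, 61, 84, 70],
}

weights = inverse_mse_weights(model_preds, actual_scores)
blended = blend(model_preds, weights)
print(weights)
print(blended)
```

    Weights fitted and judged on the same games will look better than they really are, so it is safer to fit them on one season and check them on another.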

  2. #2
    Waterstpub87
    Slan go foill
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    I'd watch out with the KenPom predictions, especially when it comes to totals. Last year I got absolutely creamed at the end of the season and in the tournament using a KenPom-based system. Not that the same thing would happen to you, but it is something to keep in mind.

    I don't really bother to backtest much anymore. After so many times where stuff worked in the backtest and failed live, I don't waste a lot of time on it. To be honest, my models are all several years in, so much of my recent work has been on better operational stuff. Usually, when I am changing something, I will run through the first few months of a season. I focus more on how close I am to the line. If I am within a point or so of the closing line on 80+% of NBA games, I know I have a decent model. For new stuff, I will generate close to a season's worth of data; sometimes it works out really well, and sometimes it crashes and burns.

    Most of the time though, I am not doing anything too crazy, so if I am close to the line, I am pretty sure my model is good.
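
    The closeness-to-the-line check above can be sketched as follows. The spreads are invented, and reading "within a point or so" as a tolerance of exactly 1.0 is an assumption.

```python
def share_within(model_lines, closing_lines, tolerance=1.0):
    """Fraction of games where the model's line is within `tolerance` points of the close."""
    if len(model_lines) != len(closing_lines) or not model_lines:
        raise ValueError("need two non-empty, equal-length sequences")
    hits = sum(
        1 for model, close in zip(model_lines, closing_lines)
        if abs(model - close) <= tolerance
    )
    return hits / len(model_lines)

# Hypothetical spreads for five games
model_spreads = [-4.5, 2.0, -7.5, 1.0, -3.0]
closing_spreads = [-5.0, 3.5, -7.0, 1.5, -3.5]

share = share_within(model_spreads, closing_spreads)
print(f"{share:.0%} of games within 1 point of the closing line")
```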

  3. #3
    HeeeHAWWWW
    Join Date: 06-13-08
    Posts: 5,487
    Betpoints: 578

    Brier scores work well for this: calculate it for your model's predictions, and for the implied prob of the market.

    It's not perfect because of differing subsets, but it's quick.
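
    A minimal sketch of that comparison. The probabilities and outcomes are invented, and the proportional vig removal in `no_vig_prob` is just one of several ways to get an implied probability from a two-way price.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome (lower is better)."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("need two non-empty, equal-length sequences")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def no_vig_prob(decimal_odds_a, decimal_odds_b):
    """Implied probability of side A after proportionally removing the bookmaker margin."""
    raw_a = 1.0 / decimal_odds_a
    raw_b = 1.0 / decimal_odds_b
    return raw_a / (raw_a + raw_b)

# Hypothetical forecasts for the same four bets, with 1 = the bet won
model_probs = [0.62, 0.48, 0.71, 0.55]
market_probs = [0.58, 0.52, 0.66, 0.50]
outcomes = [1, 0, 1, 1]

print("model :", round(brier_score(model_probs, outcomes), 4))
print("market:", round(brier_score(market_probs, outcomes), 4))
```

    A model whose Brier score is consistently below the market's on the same set of games is doing something right; on only a handful of games the difference is mostly noise.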

  4. #4
    Bsims
    Join Date: 02-03-09
    Posts: 827
    Betpoints: 13

    Correlation Summary

    Here are the results of the correlations between the predicted scores and the actual scores for CBB games in 2017-18 season.

    Predictor Scores               | # Predictions | Avg Regulation Score Team 1 | Avg Pred Team 1 | Corr 1 | Avg Regulation Score Team 2 | Avg Pred Team 2 | Corr 2
    LV implied scores              | 3,971         | 69.6                        | 70.0            | 0.524  | 74.6                        | 75.0            | 0.572
    Like games average scores      | 3,505         | 70.3                        | 71.0            | 0.498  | 73.5                        | 74.1            | 0.508
    KenPom predicted scores        | 3,884         | 69.6                        | 70.0            | 0.502  | 74.6                        | 74.8            | 0.543
    Power ratings predicted scores | 2,725         | 69.9                        | 69.9            | 0.447  | 74.1                        | 74.5            | 0.441

    Total Games on Scores File: 3,975
    Games at Neutral Sites: 590
    Percent of Neutral Site Games: 14.8%

    Note that I only used the score at the end of regulation play, ignoring overtime. Normally team 1 is the visitor, and team 2 is the home team. For games at neutral sites, both are considered visitors. This may raise some questions. I’ll try to deal with some obvious ones in subsequent posts. I have put a spreadsheet with the source data and summary in the cloud. Hopefully you can access it via the following URL, bit.ly/2A1f8pE

  5. #5
    Bsims
    Join Date: 02-03-09
    Posts: 827
    Betpoints: 13

    Quote Originally Posted by HeeeHAWWWW View Post
    Brier scores work well for this: calculate it for your model's predictions, and for the implied prob of the market.

    It's not perfect because of differing subsets, but it's quick.
    Interesting, I'll have to learn more about this. I can think of some other applications.

  6. #6
    HeeeHAWWWW
    Join Date: 06-13-08
    Posts: 5,487
    Betpoints: 578

    Quote Originally Posted by Bsims View Post
    Interesting, I'll have to learn more about this. I can think of some other applications.
    Other possibilities are the other proper scoring rules: log loss and spherical loss. Log loss is usually the most practical of the three for most purposes, but given that most bets are in the middle of the probability range, Brier is likely best for most people.
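
    For reference, a small sketch of the other two rules for binary outcomes. The clipping constant `eps` is an implementation choice to avoid log(0), not part of the definition, and the spherical rule is written here as a score where higher is better.

```python
import math

def log_loss(probs, outcomes, eps=1e-15):
    """Average negative log-likelihood of the 0/1 outcomes (lower is better)."""
    total = 0.0
    for p, o in zip(probs, outcomes):
        p = min(max(p, eps), 1 - eps)  # clip so log() never sees 0
        total += -(o * math.log(p) + (1 - o) * math.log(1 - p))
    return total / len(probs)

def spherical_score(probs, outcomes):
    """Average spherical score for binary forecasts (higher is better)."""
    total = 0.0
    for p, o in zip(probs, outcomes):
        p_outcome = p if o == 1 else 1 - p  # probability given to what happened
        total += p_outcome / math.sqrt(p ** 2 + (1 - p) ** 2)
    return total / len(probs)

# Hypothetical forecasts, with 1 = the bet won
probs = [0.62, 0.48, 0.71, 0.55]
outcomes = [1, 0, 1, 1]
print(round(log_loss(probs, outcomes), 4), round(spherical_score(probs, outcomes), 4))
```

    Log loss punishes confident misses much harder than Brier does, which matters mostly for forecasts near 0 or 1.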

  7. #7
    nash13
    Join Date: 01-21-14
    Posts: 1,122
    Betpoints: 7160

    https://mathematicalfootballpredictions.com/montecarlo/
    I guess there is enough here to evaluate your betting process.

  8. #8
    yak merchant
    Join Date: 11-04-10
    Posts: 109
    Betpoints: 6170

    Quote Originally Posted by HeeeHAWWWW View Post
    Brier scores work well for this: calculate it for your model's predictions, and for the implied prob of the market.

    It's not perfect because of differing subsets, but it's quick.
    So how do you deal with interval/ratio data types with Brier scores? Do you convert everything to Moneyline probabilities or are you binning results? Every example I’ve ever seen is analyzing probabilities between Predicted and actual for Nominal or Ordinal types.

  9. #9
    HeeeHAWWWW
    Join Date: 06-13-08
    Posts: 5,487
    Betpoints: 578

    Quote Originally Posted by yak merchant View Post
    So how do you deal with interval/ratio data types with Brier scores? Do you convert everything to Moneyline probabilities or are you binning results?
    No need for binning, it inherently calibrates across the whole range.

    All you need is the (binary) outcome, and prediction %. This is a better metric than traditional ones using binary outcomes vs binary predictions (e.g. accuracy, Kappa, AUC etc.), because those throw away a lot of info about the prediction.
    Last edited by HeeeHAWWWW; 12-17-18 at 04:10 PM.

  10. #10
    yak merchant
    Join Date: 11-04-10
    Posts: 109
    Betpoints: 6170

    Quote Originally Posted by HeeeHAWWWW View Post
    No need for binning, it inherently calibrates across the whole range.

    All you need is the (binary) outcome, and prediction %. This is a superior metric than traditional ones using binary outcomes vs binary predictions (eg accuracy, Kappa, AUC etc), because those are throwing away a lot of info about the prediction.
    Well, I guess that is my question: the model in question is comparing predicted scores to actual scores, not a binary outcome.

  11. #11
    peacebyinches
    pull the trigger
    Join Date: 02-13-10
    Posts: 1,108
    Betpoints: 7802

    I look forward to seeing how this works out, Bsims.

  12. #12
    HeeeHAWWWW
    Join Date: 06-13-08
    Posts: 5,487
    Betpoints: 578

    Quote Originally Posted by yak merchant View Post
    Well I guess that is my question the model in question is comparing predicted scores to actually scores not a binary outcome.
    Ahh, gotcha. I suppose you could use traditional regression metrics (mean squared error etc.), scoring both your predicted line and the market's middle point. Problematic in lower scoring sports though, or those with irregular scoring distributions.

    Binary over/under or a particular handicap also has the nice advantage of focusing your prediction efforts on improving accuracy in the area that matters - ie exactly the thing you're trying to predict and bet on.
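
    Both ideas can be sketched together: a squared-error comparison of model and market totals, then a conversion of the same games into binary over/under outcomes that a Brier score can use. All numbers are hypothetical, and dropping pushes is an assumption.

```python
def mean_squared_error(preds, actuals):
    """Average squared miss between predicted and actual values."""
    return sum((p - a) ** 2 for p, a in zip(preds, actuals)) / len(actuals)

# Hypothetical game totals
model_totals = [142.0, 151.5, 138.0, 160.0]
market_totals = [144.5, 149.0, 140.5, 157.5]
actual_totals = [139, 155, 147, 158]

model_mse = mean_squared_error(model_totals, actual_totals)
market_mse = mean_squared_error(market_totals, actual_totals)
print(f"model MSE {model_mse}, market MSE {market_mse}")

# Binary view: did the game go over the market total? (1 = over; pushes dropped)
over_outcomes = [
    1 if actual > line else 0
    for actual, line in zip(actual_totals, market_totals)
    if actual != line
]
print(over_outcomes)
```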

  13. #13
    danshan11
    I am good at coin flips, I really am!
    Join Date: 07-08-17
    Posts: 4,101
    Betpoints: 2888

    I think closing line predictions are more predictive than actual scores of past games. The big issue I see with the idea is the injuries, rest, and suspensions of players that actually change the line. A perfect example: the Rockets are a different team without Harden. Also consider that CBB teams are very different from one season to the next, especially with the loss of superstar one-and-dones.

  14. #14
    Waterstpub87
    Slan go foill
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    Quote Originally Posted by danshan11 View Post
    I think closing line predictions are predictive than actual scores of past games. The big issue I see with the idea is the injuries, rest, suspensions of players that actually change the line. Perfect example is the Rockets without Harden is a different team without Harden. Also considering that CBB teams are very different from day one to the next season especially with loss of superstar one and dones.
    If you are actually testing realistically, you should account for injuries. When I was testing NBA models, I set up a scraper that would scrape the games' lineups for a particular day. All I had to do was hit 2 buttons, one to pull the lineups and one to process the results.

    If you are testing CBB it is a little different. But you should account for returning starters when projecting next year. I calculated returning minutes, and went from there.

  15. #15
    danshan11
    I am good at coin flips, I really am!
    Join Date: 07-08-17
    Posts: 4,101
    Betpoints: 2888

    I don't think his model is doing that, and to do it successfully you need an algo for player worth. I use a team weight system: I give each player a value and compare that to total team value!

  16. #16
    Bsims
    Join Date: 02-03-09
    Posts: 827
    Betpoints: 13

    Quote Originally Posted by danshan11 View Post
    I think closing line predictions are predictive than actual scores of past games. The big issue I see with the idea is the injuries, rest, suspensions of players that actually change the line. Perfect example is the Rockets without Harden is a different team without Harden. Also considering that CBB teams are very different from day one to the next season especially with loss of superstar one and dones.
    Agree. The problem with any handicapping or predictive model is that unknown information like injuries will make some wagers look too good. Somehow one must account for this and be leery of those wagers. I tend to compute a return per dollar and bet on those with returns above $1.00. If the return is something like $1.25, be very careful.
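
    The post does not spell out how the return per dollar is computed. One plausible reading, sketched here as an assumption, is win probability times the decimal payout (stake included), ignoring pushes. The $1.00 and $1.25 cutoffs are from the post; the function names are made up.

```python
def american_to_decimal(odds):
    """Convert American odds (e.g. -110, +150) to decimal odds."""
    if odds > 0:
        return 1 + odds / 100
    return 1 + 100 / abs(odds)

def return_per_dollar(win_prob, american_odds):
    """Expected amount returned, stake included, per $1 wagered (pushes ignored)."""
    return win_prob * american_to_decimal(american_odds)

def classify(rpd, bet_threshold=1.00, suspicious_threshold=1.25):
    """Pass, bet, or flag as suspiciously good."""
    if rpd <= bet_threshold:
        return "pass"
    if rpd >= suspicious_threshold:
        return "suspicious"
    return "bet"

# Hypothetical: the model gives a side a 55% chance at -110
rpd = return_per_dollar(0.55, -110)
print(f"${rpd:.2f} per dollar ->", classify(rpd))
```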

    Your second point is also good. CBB is a good example of where a team might change significantly from year to year. Of the 4 models, the LV one and like games (since it comes from LV) are probably the best early on. KenPom probably considers player changes, though I'm skeptical about how well this can be done. The power rating system won't generate ratings for a team until it has scores for at least 3 games at the appropriate site. That's why it has about a thousand fewer games than the others.

    I'm planning a follow-up study that will look at correlations by month. I would expect the power rating system to improve the most. In a previous study the ratings got better with more data.

  17. #17
    Bsims
    Join Date: 02-03-09
    Posts: 827
    Betpoints: 13

    One issue I always face is how to account for home court advantage. Three of the four models take this into account; only the power rating system faces this problem. One approach is to adjust the predicted scores by some home court advantage. I don't like this approach.

    Since basketball teams play lots of games, I look at each team as two different teams, one on the road and one at home. Thus I have two ratings for Duke, one for vDuke and the other for hDuke.

  18. #18
    tsty
    Join Date: 04-27-16
    Posts: 510
    Betpoints: 4345

    You can do regression with past odds instead of results? Lol

  19. #19
    Waterstpub87
    Slan go foill
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    Quote Originally Posted by Bsims View Post
    One issue I always face is how to account for home court advantage. Three of the four models take this in account. The power rating system alone faces this problem. One approach is to adjust the predicted scores by some home court advantage. I don't like this approach.

    Since basketball teams play lots of games, I look at each team as two different teams, one on the road and one at home. Thus I have two ratings for Duke, one for vDuke and the other for hDuke.
    You have to consider it per possession, not as a flat number. Consider that much of the home vs away difference comes from things like penalties and fouls. If a team commits 0.25 fewer fouls per possession, 60 vs 100 possessions makes a large difference.
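
    The arithmetic behind that point, using the post's illustrative 0.25 figure and its two possession counts (the function name is made up):

```python
def game_level_effect(per_possession_edge, possessions):
    """Scale a per-possession difference up to a whole game."""
    return per_possession_edge * possessions

foul_edge = 0.25  # fewer fouls per possession, the post's illustrative figure
for possessions in (60, 100):
    fewer_fouls = game_level_effect(foul_edge, possessions)
    print(f"{possessions} possessions -> {fewer_fouls} fewer fouls")
```

    The same per-possession edge is worth two thirds more in the faster game, which is why a flat adjustment misprices games at either end of the tempo range.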

    I've always been the opposite on home vs away. By the time you get to 10 home and 10 away games, most of the season is gone. So at this point in the season, you are probably somewhere around 4 home, 2 neutral, and 2 away, or something similar. Any results that you get, especially extreme ones, are much more likely to be random than an actual signal.

    If instead you use a constant, you can use thousands of games to generate the home vs away advantage, meaning the number is much more likely to be valid. In CBB this may not be exact, because many teams play weaker teams at home, like Duke playing Abilene Christian in the first game of the season. Also, some teams (Denver comes to mind) benefit extra because the conditions there are more extreme. But in general, this is a much cleaner and more accurate approach.
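
    A crude sketch of the constant approach: average the home-minus-visitor margin over every non-neutral game. The record layout and scores are hypothetical, and as noted above the raw average is inflated when strong teams schedule weak opponents at home, so treat the result as an upper-end estimate.

```python
def constant_hca(games):
    """Mean home-minus-visitor margin over non-neutral games."""
    margins = [
        g["home_score"] - g["visitor_score"]
        for g in games
        if not g["neutral"]
    ]
    if not margins:
        raise ValueError("no non-neutral games to average")
    return sum(margins) / len(margins)

# Hypothetical results; the neutral-site game is ignored
games = [
    {"home_score": 78, "visitor_score": 70, "neutral": False},
    {"home_score": 65, "visitor_score": 68, "neutral": False},
    {"home_score": 81, "visitor_score": 74, "neutral": False},
    {"home_score": 72, "visitor_score": 71, "neutral": True},
]

print(constant_hca(games))
```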

  20. #20
    Bsims
    Join Date: 02-03-09
    Posts: 827
    Betpoints: 13

    If I were to use home court advantage, I'd probably use KenPom's instead of a constant value. Currently his biggest HCAs are for Colorado (4.5) and Iowa State (4.4). His lowest are Grambling St. and Navy (1.6). His median is 3.2.

  21. #21
    HeeeHAWWWW
    Join Date: 06-13-08
    Posts: 5,487
    Betpoints: 578

    Quote Originally Posted by Bsims View Post
    I tend to compute a return per dollar and bet on those with returns above $1.00. If the return is something like $1.25, be very careful.
    Strongly agree with this (at least in any liquid market). You can prove it with sufficient betting history too: your edge estimates have errors, and as the edge increases, typically those will become asymmetrical - ie the real edge will be well below your estimate.

    There's a good logical explanation: very large edges represent where the market knows something your model doesn't.

    For anyone using Kelly this all becomes rather important :-)
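
    To see why it matters for Kelly, here is a small sketch. `kelly_fraction` is the standard formula for a simple win/lose bet; `shrink_toward_market` and its `trust` parameter are a hypothetical way to discount a model's probability, not something described in this thread.

```python
def kelly_fraction(win_prob, decimal_odds):
    """Kelly stake as a fraction of bankroll for a win/lose bet; 0 if there is no edge."""
    b = decimal_odds - 1.0  # net profit per unit staked on a win
    fraction = (b * win_prob - (1.0 - win_prob)) / b
    return max(fraction, 0.0)

def shrink_toward_market(model_prob, market_prob, trust=0.5):
    """Pull the model's probability part of the way back toward the market's."""
    return market_prob + trust * (model_prob - market_prob)

# Hypothetical even-money bet: the model says 60%, the market implies 50%
naive_stake = kelly_fraction(0.60, 2.0)
shrunk_prob = shrink_toward_market(0.60, 0.50, trust=0.5)
shrunk_stake = kelly_fraction(shrunk_prob, 2.0)
print(f"naive: {naive_stake:.2f} of bankroll, shrunk: {shrunk_stake:.2f}")
```

    In this example, halving the believed edge halves the recommended stake, so overestimated edges translate directly into overbetting.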

  22. #22
    tsty
    Join Date: 04-27-16
    Posts: 510
    Betpoints: 4345

    Quote Originally Posted by HeeeHAWWWW View Post
    Strongly agree with this (at least in any liquid market). You can prove it with sufficient betting history too: your edge estimates have errors, and as the edge increases, typically those will become asymmetrical - ie the real edge will be well below your estimate.

    There's a good logical explanation: very large edges represent where the market knows something your model doesn't.

    For anyone using Kelly this all becomes rather important :-)
    Selectively following your model is wrong imo

    Either 100 or nothing

  23. #23
    Bsims
    Join Date: 02-03-09
    Posts: 827
    Betpoints: 13

    I've eliminated the neutral site games. All the correlations went up a bit. Each model does a better job of predicting the home score than the visitors, except the power rating system. Maybe I need to rethink my home court advantage.

    Predictor Scores (eliminating neutral site games) | # Predictions | Avg Regulation Score Team 1 | Avg Pred Team 1 | Corr 1 | Avg Regulation Score Team 2 | Avg Pred Team 2 | Corr 2
    LV implied scores                                 | 3,381         | 69.6                        | 70.0            | 0.530  | 74.9                        | 75.1            | 0.585
    Like games average scores                         | 2,956         | 70.4                        | 71.1            | 0.502  | 73.7                        | 74.1            | 0.516
    KenPom predicted scores                           | 3,311         | 69.7                        | 69.9            | 0.510  | 74.9                        | 74.9            | 0.554
    Power ratings predicted scores                    | 2,411         | 70.2                        | 70.0            | 0.455  | 74.3                        | 74.6            | 0.445

  24. #24
    danshan11
    I am good at coin flips, I really am!
    Join Date: 07-08-17
    Posts: 4,101
    Betpoints: 2888

    Quote Originally Posted by tsty View Post
    You can do regression with past odds instead of results? Lol
    What is more accurate as a predictor of future scores: the closing total for a team of 31 points, or the actual score of 67 that came about because the opponent's starting center had the worst night of his career?

    If the books have Yale with totals of
    31, 33, 35, 41, 39
    and the actual scores were
    39, 20, 33, 29, 65
    which do you think is more indicative of their next game score:
    the 37.2 actual score average, or
    the 35.8 average line?
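
    The two averages in the example check out. The sketch below reproduces them and also prints the spread of each series, which is one way (an addition here, not something the post computes) to see how much noisier the actual scores are than the lines.

```python
from statistics import mean, pstdev

# Figures from the Yale example above
closing_totals = [31, 33, 35, 41, 39]
actual_scores = [39, 20, 33, 29, 65]

avg_line = mean(closing_totals)
avg_score = mean(actual_scores)
print(f"average line: {avg_line:.1f}, average actual score: {avg_score:.1f}")

# Population standard deviation of each series
print(f"line spread: {pstdev(closing_totals):.1f}, score spread: {pstdev(actual_scores):.1f}")
```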

  25. #25
    danshan11
    I am good at coin flips, I really am!
    Join Date: 07-08-17
    Posts: 4,101
    Betpoints: 2888

    Really, I don't see the idea or the edge in doing this; you are not doing anything more advanced than even a basic model. I don't see how this system could give you any edge. Do you think it is possible to use this to predict more accurately than the closing line can?

  26. #26
    vampire assassin
    Join Date: 03-09-18
    Posts: 279
    Betpoints: 9896

    If you look at the set of wagers where your projected ROR is >10%, these will typically do worse than your 3-6% range. As you said, there is an injury or other big change, and your +EV bet has turned into a coin flip.

    If you have a large data set, you can flag matches >10% (or <-10%), or find the sweet spot where you discard matches due to informational disadvantage. If you do this when betting, you'll save a fortune. I lost a 6-fig fortune on the sum of these small positives.
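
    A minimal sketch of that check: group settled bets by projected ROR and compare realized ROI per group. The bucket edges follow the ranges mentioned above; the bet records are invented purely to show the mechanics and are not results.

```python
def bucket_label(projected_ror):
    """Assign a bet to a projected-ROR bucket."""
    if projected_ror > 0.10:
        return ">10%"
    if 0.03 <= projected_ror <= 0.06:
        return "3-6%"
    return "other"

def roi_by_bucket(bets):
    """Realized ROI (total profit / total staked) per bucket.

    Each bet is a (projected_ror, stake, profit) tuple.
    """
    totals = {}
    for projected_ror, stake, profit in bets:
        label = bucket_label(projected_ror)
        staked, won = totals.get(label, (0.0, 0.0))
        totals[label] = (staked + stake, won + profit)
    return {label: won / staked for label, (staked, won) in totals.items()}

# Hypothetical settled bets at -110: a win returns +91 on a 100 stake
bets = [
    (0.04, 100, 91), (0.05, 100, -100), (0.04, 100, 91),
    (0.12, 100, -100), (0.15, 100, 91), (0.11, 100, -100),
]

roi = roi_by_bucket(bets)
for label, value in roi.items():
    print(f"{label}: {value:+.1%}")
```

    With a real betting history, sweeping the upper cutoff instead of fixing it at 10% is one way to look for the sweet spot mentioned above.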

  27. #27
    u21c3f6
    Join Date: 01-17-09
    Posts: 790
    Betpoints: 5198

    Quote Originally Posted by HeeeHAWWWW View Post
    ...
    There's a good logical explanation: very large edges represent where the market knows something your model doesn't. ...
    Ding, ding, ding!!! We have a winner! (From my point of view)

    The above is in large part the focus of what I look for when making selections. You see this phenomenon mentioned in various forms in many threads (think "lock" threads for one form) but not many actually try to use this to their advantage IMO.

    Joe.

  28. #28
    ChuckyTheGoat
    Join Date: 04-04-11
    Posts: 31,504
    Betpoints: 24857

    Good work, Bsims. Best of luck.

  29. #29
    tsty
    Join Date: 04-27-16
    Posts: 510
    Betpoints: 4345

    Quote Originally Posted by danshan11 View Post
    what is more accurate as a predictor of future scores. The total for a team at closing of 31 points or the actual score of 67 since the starting center of the opponent had his worst night in his career?

    if the books have Yale with totals of
    31, 33, 35, 41, 39
    and the actual scores were
    39, 20, 33, 29, 65
    which do you think is more indicative of their next game score
    37.2 actual score avg or
    35.8 which was the line
    How do you write a model without using past results? It's literally the only way lol

    Using past odds is retarded since it was less accurate in the past

  30. #30
    danshan11
    I am good at coin flips, I really am!
    Join Date: 07-08-17
    Posts: 4,101
    Betpoints: 2888

    Quote Originally Posted by tsty View Post
    How do you write a model without using past results? It's literally the only way lol

    Using past odds is retarded since it was less accurate in the past
    Because past results are less indicative of future performance than past lines are.
    Say a team wins 10 straight games by 40 points. Is that more indicative of their power ranking, as if they were -40 favorites, or is the average line of -8 a more accurate guide to future performance? Also, past lines already incorporate past game results.

    Say the Yankees averaged 12 runs over their last 10 games. I think that is less indicative of their offensive power than their average team total line of 7.5 over the same 10 games.
    I would use the 7.5, not the 12; the 7.5 is a better indicator of future performance than the 12.

    For example, take Kluber: in his last game there were 9 runs scored.
    Do you think that 9 is a better number than the total of 6.5 for future games? Which is more indicative of future performance, the line or the result?

    When I say past results, I mean the last 10 games up to a season, not the last 25 years.

  31. #31
    tsty
    Join Date: 04-27-16
    Posts: 510
    Betpoints: 4345

    Lol u just completely ignore my question but w.e

    I'll ask a different one then

    How did the bookies make those odds? Where were they derived from?

  32. #32
    danshan11
    I am good at coin flips, I really am!
    Join Date: 07-08-17
    Posts: 4,101
    Betpoints: 2888

    Lines are made with power rankings, weather, and injuries, and I believe books adjust for teams and situations they have tons of data on. For example, the Patriots at home probably get a little extra push from the books: even though the rankings say X, the line is probably X plus a dash of salt.

  33. #33
    danshan11
    I am good at coin flips, I really am!
    Join Date: 07-08-17
    Posts: 4,101
    Betpoints: 2888

    Quote Originally Posted by tsty View Post
    Lol u just completely ignore my question but w.e

    Ill ask a different one then

    How did the bookies make those odds? Where were they derived from?
    you did not answer any of my questions

  34. #34
    danshan11
    I am good at coin flips, I really am!
    Join Date: 07-08-17
    Posts: 4,101
    Betpoints: 2888

    Quote Originally Posted by danshan11 View Post
    because past results are not indicative of future performance past lines are better.
    a team win 10 games straight by 40 points is that more indicative of their power ranking as -40 favorites or is the avg line of -8 more accurate of future performance. Also past lines are a collaboration of past game results.

    I think the avg score of Yankees is 12 runs last 10 is less indicative of the offense power as the avg team total line of 7.5 in last 10
    I would use the 7.5 not the 12, the 7.5 is better indicator of future performance than the 12

    example you take Kluber in his last game there were 9 runs scored
    do you think that 9 is a better number than the total of 6.5 for future games, which is more indicative of future performance, the line or the result?

    when i say past results I am saying last 10 games up to a season not last 25 years
    I bolded the question to help you see it

  35. #35
    danshan11
    I am good at coin flips, I really am!
    Join Date: 07-08-17
    Posts: 4,101
    Betpoints: 2888

    I also just read that some books are now focusing more on line history over power rankings to try and get the line more stable start to finish
