1. #1
    ProphetofProfit
    Join Date: 03-24-11
    Posts: 26

    Backtesting - Should I bother?

    My model is nearly complete, and I'm considering backtesting using Pinnacle odds for a few hundred games to check if the model is profitable or not. Unfortunately, the odds aren't easily available to me, and I'll have to spend many many hours manually inputting them. Add to that the fact that I don't have injury news to teams playing these games, and if I wanted injury news, I'd have to spend about 5 minutes per team per game to get it, so we're talking a massive amount of time.

    Would I be better off simply testing my model with different parameters to get the lowest R squared number possible, then gambling with money I'm comfortable losing? It's a lot less hassle.

    R squared here is the squared difference between the actual goals scored and my model's predicted goals scored. I think that's right.

    n.b.

    Is R squared the best estimator of accuracy or is there something else I should be using?

  2. #2
    mjespoz
    Join Date: 02-15-11
    Posts: 42
    Betpoints: 543

    From what you've said, R= residual (observed - predicted). So minimizing R squared should be a good strategy. However, you should always properly back test any model - using data that wasn't used to build the model.

    Good luck!
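A minimal sketch of testing on "data that wasn't used to build the model", assuming the games are held in a plain list in the order they were played. The function name and the split fraction are illustrative assumptions, not anything specified in the thread.

```python
def chronological_split(games, train_fraction=0.7):
    """Split date-ordered games into a fitting set and a held-out test set."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cutoff = int(len(games) * train_fraction)
    # Earlier games are used to tune parameters; later games are only scored.
    return games[:cutoff], games[cutoff:]


# Hypothetical example: ten games labelled by the order they were played.
games = list(range(1, 11))
fit_games, test_games = chronological_split(games, train_fraction=0.5)
print(fit_games)   # games used to choose parameters
print(test_games)  # games the model never saw while being tuned
```

Splitting by date rather than at random keeps the test closer to real betting, where only past games are available when a bet is placed.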

  3. #3
    donkson
    Join Date: 03-12-11
    Posts: 411
    Betpoints: 1797

    You want to maximise R^2 not minimize it.

    It doesn't mean residual, it's called the coefficient of determination.
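To make the distinction concrete, here is a rough sketch that computes both quantities from the same predictions. The sum of squared residuals and the mean absolute error are things you want small; R^2 (the coefficient of determination) is something you want large. The function names and the goal figures are made up for illustration.

```python
def sum_squared_residuals(actual, predicted):
    """Sum of (observed - predicted)^2. Lower is better."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def mean_absolute_error(actual, predicted):
    """Average distance between observed and predicted goals. Lower is better."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot. Higher is better."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum_squared_residuals(actual, predicted)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical goals scored and a model's predicted goals for five games.
actual = [0, 1, 2, 3, 1]
predicted = [1.0, 1.2, 1.8, 2.1, 1.4]
print(sum_squared_residuals(actual, predicted))
print(mean_absolute_error(actual, predicted))
print(r_squared(actual, predicted))
```

A model that always predicts the average number of goals scores an R^2 of 0, and a perfect model scores 1, so minimizing squared residuals and maximizing R^2 on the same data point in the same direction.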

  4. #4
    Wrecktangle
    Join Date: 03-01-09
    Posts: 1,524
    Betpoints: 3209

    Why backtest when you can pray?

  5. #5
    illfuuptn
    Join Date: 03-17-10
    Posts: 1,860

    lol @ backtesting. Bet the money you are comfortable with losing for a month or so. See if the line movement agrees with you. It's as simple as that.

  6. #6
    PanamaBrad
    Join Date: 03-22-11
    Posts: 717
    Betpoints: 395

    If you think it's worthy, spend the time to put together a no-risk sample of at least 500 games. What if it took 40-80 hours to do it? That is a small price to pay to establish a starting database.
    Points Awarded:

    Justin7 gave PanamaBrad 2 SBR Point(s) for this post.


  7. #7
    Miz
    Join Date: 08-30-09
    Posts: 695
    Betpoints: 3162

    backtesting multiple years can save you money. I have seen some really promising models do well for over 1000 games, only to come down to earth in the next 1000.

    If you are betting for fun, then test it forward, have a beer and watch the game. If you really want to be serious about it, I think you need to put the work in and test your idea objectively.

    btw, there is nothing wrong with just having some fun with this, just don't expect to just stumble across a winning idea. Ideas usually take refining before they work well.

  8. #8
    Maniac
    Join Date: 04-12-11
    Posts: 667
    Betpoints: 8815

    Have been working on something myself for the last couple of months and have been both backtesting and also testing it as it goes along with small bets ($20) per selection. Any tweaks I make to the running model will also be backtested against the last 3 years' worth of games/lines with the intention of (hopefully!) having a more accurate model ready to go for the start of the next NBA season...assuming no lockout that is !

    For me, there are a number of different factors that I want to look at and I also want to see if there is a particular difference in the accuracy of the model over the course of a season, to see if it performs stronger at the start/at the end/mid-way through whatever - so for me, 3 years minimum is what I am looking at, and may even go back another couple of years IF there are any particular patterns that I find in the data.

    A lot does depend on exactly how long it would take you to find the info you are looking for and input it manually, and how easy the info is to come by will depend on what sport and exactly what info you are looking at. In terms of NBA, MLB, NFL etc I have found it relatively easy to find various databases/odds archives that have a lot of the info I need to backtest...other sports, might not be so easy !

  9. #9
    InTheRed
    InTheRed's Avatar SBR PRO
    Join Date: 12-25-09
    Posts: 455
    Betpoints: 17596

    I had an NBA system that I set up. Instead of backtesting, I ran it through the 2010-11 season, betting casually and small. Didn't play every system play. But kept a record of the entire season. Also, took notes, and kept an eye on certain parameters for the system (difference in numbers, b2b, etc.) so I had records for those variables as well.

    Now I have a complete season of "backtesting" that I've done and a direction for where I want to be for next year along with certain variables that I may want to add for next season.

    The NBA isn't going anywhere and I still have a life of betting in front of me. A system is a living thing, it will always be changing, hopefully for the better.

    If it's a system you truly believe in, take the time and years to get it set up and running where you are comfortable in betting larger amounts.

  10. #10
    ManBearPig
    Join Date: 12-04-08
    Posts: 2,473

    I think testing for statistical significance would be one of the most important steps. It's like building the car of your dreams but not test driving and not even getting down the street...wouldn't you rather get to the store or across state lines??? It may be a pain in the ass but I think it's worth the time.

    Gathering data to establish a W/L record and find z-scores and standard errors will go a long way in telling you if you have something viable long term or if you just found a flash in the pan. If the relationship you're testing is strong, the results should hold steady as you add more plays. Although 2 standard deviations is a strong indicator, you have to be careful of false positives obtained over a short period of time, as you may have only maximized one angle or hypothesis.

    It could be that you found a strong relationship for a short blip in time that hits at 57% over 200 plays and then spits out 50% winners going forward, which will obviously bring down your numbers that looked initially so promising. It's much easier to see this by back-testing to try and find a maintainable relationship that will allow you to predict future winners, not just past ones, at a winning rate. Aiming for more extreme results like higher STDev and establishing multiple hypotheses will help.

    ...or just ride it out and see what happens where nothing is guaranteed.
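As a rough illustration of the z-score idea, the sketch below measures how many standard deviations a W/L record sits above a break-even win rate, using the normal approximation to the binomial. The function name is made up, and the 11/21 break-even figure assumes standard -110 pricing, which the thread does not specify.

```python
import math


def win_rate_z_score(wins, plays, breakeven=0.5):
    """Standard deviations between the observed wins and the break-even count."""
    expected_wins = plays * breakeven
    std_dev = math.sqrt(plays * breakeven * (1.0 - breakeven))
    return (wins - expected_wins) / std_dev


# 57% over 200 plays is 114 wins.
print(win_rate_z_score(114, 200))                     # about 1.98 against a coin flip
print(win_rate_z_score(114, 200, breakeven=11 / 21))  # about 1.31 against -110 pricing (assumed)
```

The same record looks noticeably weaker once the vig is included in the break-even rate, which is one reason a 200-play sample at 57% can still be a flash in the pan.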

  11. #11
    ProphetofProfit
    Join Date: 03-24-11
    Posts: 26

    Thanks for the responses. Maybe I'm being an idiot but I'll probably not backtest because although it's a good idea in theory, in practice, it'll take an insane amount of time to get injuries and opening odds. Closing odds and assuming that all teams are at 100% strength is quicker, but the results will not be informative. And this will still take me a good 50 hours since each game has to go through a macro that takes about 5 seconds to run, and this season, if I test that, I've about 5000 games. Yadda yadda I can't deal with the hassle.

    So in the interest of speed and sanity, my approach is to start with my full bankroll, but bet extremely conservatively to start with, say 1/20 Kelly. As I go along and record results, I'll run a few chi-squared tests to estimate the confidence that my method is better than breakeven. As confidence improves, the Kelly fraction increases.

    Right now I'm about 80% confident that I'm +EV, so 1/20 kelly.
    When I hit 90% with the chi-test, 1/10
    95% it's 1/5
    99% it's the max I'm comfortable with.
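The staircase above could be sketched roughly as follows, assuming decimal odds and the standard Kelly formula f = (b*p - q) / b. The function names are made up, and the top-tier multiplier is left as a required argument because "the max I'm comfortable with" is not given a number in the post.

```python
def full_kelly_fraction(win_probability, decimal_odds):
    """Full Kelly stake as a fraction of bankroll; 0 if the bet has no edge."""
    b = decimal_odds - 1.0          # net winnings per unit staked
    q = 1.0 - win_probability       # probability of losing
    return max(0.0, (b * win_probability - q) / b)


def kelly_multiplier(confidence, max_multiplier):
    """Map confidence of being +EV to the fraction of full Kelly to bet."""
    if confidence >= 0.99:
        return max_multiplier       # "the max I'm comfortable with"
    if confidence >= 0.95:
        return 1 / 5
    if confidence >= 0.90:
        return 1 / 10
    return 1 / 20                   # starting tier


# Hypothetical bet: 55% estimated win chance at decimal odds of 2.00.
bankroll = 1000.0
stake = bankroll * full_kelly_fraction(0.55, 2.00) * kelly_multiplier(0.80, max_multiplier=0.25)
print(round(stake, 2))
```

The 0.25 passed as `max_multiplier` is a placeholder only. Note that the stake depends on the estimated win probability for each bet, so a model that overestimates its edge will overbet at every tier.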

    Any gaping holes in my reasoning?

    80% estimate is created from the logic behind the model being sound, and the fact that I'm betting against European football, which isn't densely populated with sharps. And the overround I'm getting is 102% at worst.
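For reference, the overround on a market is the sum of the implied probabilities (1 / decimal odds) across all outcomes. A small sketch, with made-up home/draw/away prices chosen to land near the 102% mentioned above:

```python
def overround(decimal_odds):
    """Sum of implied probabilities; 1.0 means a book with no margin."""
    return sum(1.0 / price for price in decimal_odds)


def fair_probabilities(decimal_odds):
    """Implied probabilities rescaled to sum to 1 (simple proportional method)."""
    total = overround(decimal_odds)
    return [(1.0 / price) / total for price in decimal_odds]


# Hypothetical home / draw / away prices for one match.
prices = [2.60, 3.40, 2.90]
print(overround(prices))            # a little over 1.02, i.e. roughly a 102% book
print(fair_probabilities(prices))
```

Proportional rescaling is only one way to strip the margin, and it is used here as an assumption for simplicity.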

  12. #12
    andywend
    Join Date: 05-20-07
    Posts: 4,805
    Betpoints: 244

    A system is a living thing, it will always be changing, hopefully for the better.
    The vast majority (if not all) of betting systems are absolutely worthless.

    You will always be able to find trends that do very well if the sample is small enough (if you flip a coin 10 times, it just might come up tails on 7 occasions but that doesn't mean it's more likely to land on tails on the 11th flip).
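A quick simulation sketch of the coin-flip point: generate many worthless "systems" that each win 50% of the time and count how many still show 7 or more wins in 10 plays. The function name, the number of systems, and the seed are arbitrary choices for illustration.

```python
import random


def count_lucky_systems(num_systems, plays_per_system, threshold, seed=0):
    """Count pure-chance systems whose record reaches the threshold anyway."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    lucky = 0
    for _ in range(num_systems):
        wins = sum(rng.random() < 0.5 for _ in range(plays_per_system))
        if wins >= threshold:
            lucky += 1
    return lucky


# 1000 coin-flip systems, 10 plays each: how many look like 70%+ winners?
print(count_lucky_systems(1000, 10, 7))
```

In expectation a bit over 17% of such systems hit 7 of 10 by luck alone, so searching through enough trends on a small sample will always turn up some that look good.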

    The only system I have used that seems to be worth anything is to bet the under in the 2nd half when the favorite is blowing out the underdog in the first half and to bet the over during live betting when a heavy favorite is losing straight up late in the 3rd quarter/early 4th quarter of games.

  13. #13
    roasthawg
    Join Date: 11-09-07
    Posts: 2,990

    Quote Originally Posted by ProphetofProfit View Post
    Thanks for the responses. Maybe I'm being an idiot but I'll probably not backtest because although it's a good idea in theory, in practice, it'll take an insane amount of time to get injuries and opening odds. Closing odds and assuming that all teams are at 100% strength is quicker, but the results will not be informative. And this will still take me a good 50 hours since each game has to go through a macro that takes about 5 seconds to run, and this season, if I test that, I've about 5000 games. Yadda yadda I can't deal with the hassle.

    So in the interest of speed and sanity, my approach is to start with my full bankroll, but bet extremely conservatively to start with, say 1/20 Kelly. As I go along and record results, I'll run a few chi-squared tests to estimate the confidence that my method is better than breakeven. As confidence improves, the Kelly fraction increases.

    Right now I'm about 80% confident that I'm +EV, so 1/20 kelly.
    When I hit 90% with the chi-test, 1/10
    95% it's 1/5
    99% it's the max I'm comfortable with.

    Any gaping holes in my reasoning?

    80% estimate is created from the logic behind the model being sound, and the fact that I'm betting against European football, which isn't densely populated with sharps. And the overround I'm getting is 102% at worst.
    Solid approach.

  14. #14
    Justin7
    Join Date: 07-31-06
    Posts: 8,577
    Betpoints: 1506

    I wouldn't bet right off. First, track imaginary plays. If your approach is valid, you will know within a couple days tracking line movements. If it is not obvious within your first 40 plays (where you should expect to see at least 25 of those plays agree in line movement), then wait until your sample size gets larger.
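One way to track imaginary plays against line movement is sketched below, assuming decimal odds and assuming a play "agrees" when the closing price on your side is shorter than the price you noted. Both the definition and the function name are assumptions for illustration.

```python
def line_movement_agreement(plays):
    """Return (plays where the line moved toward the pick, total plays).

    Each play is a (price_taken, closing_price) pair in decimal odds.
    A play with no movement is counted as not agreeing.
    """
    agreed = sum(1 for taken, closing in plays if closing < taken)
    return agreed, len(plays)


# Hypothetical paper plays: (price when picked, closing price).
paper_plays = [(2.10, 2.00), (1.95, 2.05), (3.40, 3.25)]
agreed, total = line_movement_agreement(paper_plays)
print(agreed, "of", total, "plays agreed with the line movement")
```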

  15. #15
    laxbrah420
    Join Date: 10-29-10
    Posts: 210
    Betpoints: 505

    Quote Originally Posted by InTheRed View Post

    The NBA isn't going anywhere
    This is the problem with backtesting. Might be hard to foresee the lockout.

  16. #16
    Peregrine Stoop
    Join Date: 10-23-09
    Posts: 869
    Betpoints: 779

    only if you want to know the size of your back rolls

  17. #17
    Soderman
    Join Date: 09-29-10
    Posts: 33

    Quote Originally Posted by Justin7 View Post
    I wouldn't bet right off. First, track imaginary plays. If your approach is valid, you will know within a couple days tracking line movements. If it is not obvious within your first 40 plays (where you should expect to see at least 25 of those plays agree in line movement), then wait until your sample size gets larger.
    Why at least 25 out of 40? And does it have something to do with the lines as well? I'm just curious.
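One possible reading of the 25-of-40 figure (not Justin7's stated reasoning) is that it is about where pure chance becomes an unlikely explanation. If each line were equally likely to move toward or away from a pick, the chance of 25 or more agreements in 40 plays can be computed exactly from the binomial distribution:

```python
from math import comb


def prob_at_least(successes, trials, p=0.5):
    """Exact binomial probability of at least `successes` hits in `trials`."""
    return sum(
        comb(trials, k) * p**k * (1.0 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Chance of 25+ agreements in 40 plays if line moves were a coin flip.
print(prob_at_least(25, 40))  # about 0.077
```

So a no-skill picker would reach 25 of 40 only about one time in thirteen, which is suggestive but not conclusive, and consistent with the advice to keep collecting plays if the result is not obvious.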

  18. #18
    ProphetofProfit
    Join Date: 03-24-11
    Posts: 26

    I've been making some slow progress on this, and I've found that model number 1, the simplest I have, is on average 0.8 goals from the actual result. That means little to me, since I have nothing to compare it to. My thinking is that the goals scored in a game are approximately Poisson distributed, with the mean being the true average goals scored.

    For example, suppose that on average a team will score 1.5 goals, with a Poisson distribution of the goals.

    p(# goals), error, error x p(# goals)
    p(0) = 0.22, 1.5, 0.33
    p(1) = 0.33, 0.5, 0.16
    p(2) = 0.25, 0.5, 0.13
    p(3) = 0.13, 1.5, 0.19
    p(4) = 0.05, 2.5, 0.11
    p(5) = 0.01, 3.5, 0.05
    p(6) = 0.00, 4.5, 0.02

    The sum of 'error x p(goals)' is approximately 1.

    So does that mean that even if you could predict the average goals scored with 100% accuracy, your mean error would be about 1 goal, assuming that the goals distribution is approximately Poisson?
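The table can be checked with a short script. This sketch sums |k - mean| x p(k) over goal counts for a Poisson distribution, taking the Poisson assumption from the post as given; the function names and the cut-off at 50 goals are illustrative choices.

```python
import math


def poisson_pmf(goals, mean):
    """Probability of exactly `goals` goals under a Poisson distribution."""
    return math.exp(-mean) * mean**goals / math.factorial(goals)


def poisson_mean_absolute_error(mean, max_goals=50):
    """Expected |actual - mean| when the true mean is predicted exactly."""
    return sum(
        abs(goals - mean) * poisson_pmf(goals, mean)
        for goals in range(max_goals + 1)
    )


print(poisson_mean_absolute_error(1.5))  # about 1.004
```

Under that assumption the answer is yes: a forecast that nails the true mean of 1.5 still misses by about 1 goal on average. The floor changes with the mean, so a benchmark for a real model would need to be computed game by game from each predicted mean rather than from a single 1.5 figure.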

  19. #19
    byronbb
    Join Date: 11-13-08
    Posts: 3,067
    Betpoints: 2284

    You can build a model but can't write a program to scrape covers.com?

  20. #20
    ProphetofProfit
    Join Date: 03-24-11
    Posts: 26

    Why, are they the same thing?

  21. #21
    byronbb
    Join Date: 11-13-08
    Posts: 3,067
    Betpoints: 2284

    Quote Originally Posted by ProphetofProfit View Post
    Why, are they the same thing?
    No, scraping covers.com is easier.

  22. #22
    PatrickBateman
    Join Date: 03-29-08
    Posts: 367

    Everything is worth looking into before betting real money on it...trust me you can only help yourself in the end. Just make sure to take a large enough sample size

  23. #23
    brewers7
    Join Date: 03-11-06
    Posts: 298
    Betpoints: 4441

    Quote Originally Posted by ProphetofProfit View Post
    My model is nearly complete, and I'm considering backtesting using Pinnacle odds for a few hundred games to check if the model is profitable or not. Unfortunately, the odds aren't easily available to me, and I'll have to spend many many hours manually inputting them. Add to that the fact that I don't have injury news to teams playing these games, and if I wanted injury news, I'd have to spend about 5 minutes per team per game to get it, so we're talking a massive amount of time.

    Would I be better off simply testing my model with different parameters to get the lowest R squared number possible, then gambling with money I'm comfortable losing? It's a lot less hassle.

    R squared here is the squared difference between the actual goals scored and my model's predicted goals scored. I think that's right.

    n.b.

    Is R squared the best estimator of accuracy or is there something else I should be using?

    If you're looking to backtest NBA models, I have 20 full seasons worth of lines and totals data back to 1991-92 and once I get some more playoffs data done, I'll have another 6 full seasons before that...

  24. #24
    dodo_molnar
    Join Date: 06-02-14
    Posts: 3
    Betpoints: 24

    Hi guys,
    I think the best online software for backtesting is betviz.

    The ability to see how a certain betting system has performed in the past can save a lot of time.
    For example:
    When the favorite team is on a losing streak, I want to know when it will end. In Betviz you can find teams from the past with the same conditions (average goals, form, etc.). The big advantage for me is that I can find out how those matches finished and whether the odds are on my side.

  25. #25
    lamichaeljames
    Join Date: 06-02-14
    Posts: 40
    Betpoints: 109

    good stuff here.
