1. #1
    simplydusty
    Join Date: 12-18-08
    Posts: 229
    Betpoints: 24

    Back fitting to create a system?

    I was using a database of the past six MLB seasons to come up with some kind of system to use this year. The expression I came up with got pretty long and complicated, but the numbers are great. For the past six seasons ($100 units) the best was up 16k and the worst was up 12k. It's just based on stats like hits, strikeouts, starter innings pitched, and about 15 others with the line somewhere between -115 and +170. The average number of plays per season is around 320. Is there any reason to think that this season should be any different?
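To put those figures in per-play terms, here is a rough sketch of the arithmetic. It assumes every play risks one $100 unit, which the post does not actually state, and the function name is made up for illustration.

```python
UNIT_DOLLARS = 100        # unit size quoted in the post
PLAYS_PER_SEASON = 320    # approximate average quoted in the post

def units_per_play(season_profit_dollars, unit=UNIT_DOLLARS, plays=PLAYS_PER_SEASON):
    """Average profit per play, in units, assuming one unit risked per play."""
    return season_profit_dollars / unit / plays

worst = units_per_play(12_000)   # worst back-fitted season
best = units_per_play(16_000)    # best back-fitted season
print(f"worst season: {worst:.3f} units per play")
print(f"best season:  {best:.3f} units per play")
```

Under that one-unit-per-play assumption, the back-fitted results work out to between 0.375 and 0.5 units of profit per play.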

  2. #2
    LT Profits
    Join Date: 10-27-06
    Posts: 90,963
    Betpoints: 5179

    The more complicated a formula, the greater probability of data mining and thus the less predictive the formula is for the future. That said, you can still validate your formula by back-testing it further over untouched data.
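A small simulation can illustrate the data-mining point. This is only a sketch under stated assumptions: each hypothetical "system" is a pure coin flip with no real edge, and the function name and parameters are invented for the example. It keeps whichever system had the best record on the data used for selection and reports how that same system did on data it never saw.

```python
import random

def best_of_random_systems(n_systems, n_games, seed=0):
    """Pick the best in-sample record among edge-free systems.

    Returns (in_sample_wins, out_of_sample_wins) for the selected system.
    """
    rng = random.Random(seed)
    best_in_sample = -1
    best_out_of_sample = None
    for _ in range(n_systems):
        # Every bet is a 50/50 coin flip, so no system has any real edge.
        in_sample = sum(rng.random() < 0.5 for _ in range(n_games))
        out_of_sample = sum(rng.random() < 0.5 for _ in range(n_games))
        if in_sample > best_in_sample:
            best_in_sample, best_out_of_sample = in_sample, out_of_sample
    return best_in_sample, best_out_of_sample

wins_in, wins_out = best_of_random_systems(n_systems=200, n_games=320)
print(f"selected system, in-sample wins:     {wins_in} of 320")
print(f"selected system, out-of-sample wins: {wins_out} of 320")
```

The more systems (or formula variations) are tried, the better the best in-sample record tends to look, even though nothing here can predict anything.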

  3. #3
    simplydusty
    Join Date: 12-18-08
    Posts: 229
    Betpoints: 24

    I used the MLB database at killersports.com and it only goes back to 2004. I guess I can just take it slow for the first month or two and see how 2010 is doing compared to the other seasons' first few months.

  4. #4
    MadTiger
    Wait 'til next year!
    Join Date: 04-19-09
    Posts: 2,724
    Betpoints: 47

    Quote (originally posted by LT Profits):
    The more complicated a formula, the greater probability of data mining and thus the less predictive the formula is for the future. That said, you can still validate your formula by back-testing it further over untouched data.

    This.

    Standard (well, as far as I know) Operating Procedure in stats is to test with a DIFFERENT set than what you used to create the model.
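As a rough sketch of that procedure, the plays can be split by season so that the seasons used to build the model never overlap with the seasons used to test it. The record layout (`season`, `units`) and the toy numbers are made up for illustration; they do not come from killersports.com or from the original poster's system.

```python
def split_by_season(plays, holdout_seasons):
    """Separate plays into a fitting set and an untouched holdout set."""
    fit = [p for p in plays if p["season"] not in holdout_seasons]
    holdout = [p for p in plays if p["season"] in holdout_seasons]
    return fit, holdout

def total_units(plays):
    """Net units won or lost across a list of plays."""
    return sum(p["units"] for p in plays)

# Toy records, one per season, purely hypothetical.
plays = [
    {"season": 2004, "units": 1.5},
    {"season": 2005, "units": -1.0},
    {"season": 2006, "units": 1.0},
    {"season": 2007, "units": 1.5},
    {"season": 2008, "units": -1.0},
    {"season": 2009, "units": 0.5},
]

# Build the formula on 2004-2007 only, then score it once on 2008-2009.
fit, holdout = split_by_season(plays, holdout_seasons={2008, 2009})
print("fit seasons net units:    ", total_units(fit))
print("holdout seasons net units:", total_units(holdout))
```

The holdout only counts as a test if it is left alone while the formula is being tuned. Going back to adjust the formula after seeing the holdout result turns it into fitting data.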

  5. #5
    mminkovski
    Join Date: 06-22-07
    Posts: 1,056
    Betpoints: 17641

    As long as it's not a chase system that risks 100 units to win 1, you will be fine.

  6. #6
    jessetopolski
    912
    Join Date: 12-20-09
    Posts: 162

    Did Steve ever find out why the juice doubled?

  7. #7
    ljump12
    Join Date: 12-08-09
    Posts: 108
    Betpoints: 258

    Quote (originally posted by LT Profits):
    The more complicated a formula, the greater probability of data mining and thus the less predictive the formula is for the future. That said, you can still validate your formula by back-testing it further over untouched data.
    This. However, I believe the correct term is "data-snooping"

  8. #8
    kingofmonash
    my crap
    Join Date: 04-11-10
    Posts: 631
    Betpoints: 250

    makes sense

  9. #9
    Flying Dutchman
    Floggings continue until morale improves
    Join Date: 05-17-09
    Posts: 2,467
    Betpoints: 759

    Quote (originally posted by ljump12):
    This. However, I believe the correct term is "data-snooping"
    Yeah, Snoop-dog invented it.

  10. #10
    Wrecktangle
    Join Date: 03-01-09
    Posts: 1,524
    Betpoints: 3209

    "data sets are like prisoners of war, if you torture them long enough, they will admit to anything"

    Unfortunately, the medical community in their drug testing seem to have never heard of this old stat chestnut.

  11. #11
    sycoogtit
    play matchbook
    Join Date: 02-11-10
    Posts: 322

    Quote (originally posted by MadTiger):
    This. Standard (well, as far as I know) Operating Procedure in stats is to test with a DIFFERENT set than what you used to create the model.
    In Wong's book Sharp Sports Betting he says the same thing, but he also says you can use the same data to backtest IF the win-loss record from that testing has a 'rarity' of at least 1 in 10,000 (which I've found is really hard to do). I forget what he defines rarity as, but you can download a spreadsheet from his site and plug in your wins and losses to see if your model is good enough to put money on. You can download it from http://www.sharpsportsbetting.com/docs/prop_tools.shtml. Go to the 'Rarity of W-L record' tab. It will show you the rarity in percentage form and 1 in X form.
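One plausible reading of "rarity", offered here as an assumption since the definition is not given above, is the chance that a bettor with no edge would post a record at least this good. The sketch below computes that as a binomial tail probability. It is not taken from Wong's spreadsheet, the function names are made up, and the default `p=0.5` only fits bets at roughly even money; lines between -115 and +170 would need a different break-even probability.

```python
from fractions import Fraction
from math import comb

def tail_probability(wins, losses, p=Fraction(1, 2)):
    """P(at least `wins` wins in wins+losses bets) if each bet wins with probability p."""
    p = Fraction(p)
    q = 1 - p
    n = wins + losses
    # Exact rational arithmetic avoids overflow and underflow on large samples.
    total = sum(comb(n, k) * p**k * q**(n - k) for k in range(wins, n + 1))
    return float(total)

def one_in_x(prob):
    """Express a probability in '1 in X' form."""
    return 1 / prob

prob = tail_probability(14, 0)   # a hypothetical 14-0 record
print(f"rarity: {prob:.6%}, or 1 in {one_in_x(prob):,.0f}")
```

Under these assumptions, a 14-0 record at even money is 1 in 16,384, which gives a feel for how demanding a 1 in 10,000 threshold is.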

  12. #12
    nachtreter
    Join Date: 04-28-10
    Posts: 1

    How is your system doing so far this season?

  13. #13
    Siksid
    Join Date: 04-26-10
    Posts: 66
    Betpoints: 28

    Does this formula work? If so, what are your picks, so we can comprehend your thoughts?

  14. #14
    DukeJohn
    Join Date: 12-29-07
    Posts: 1,779
    Betpoints: 254

    Quote (originally posted by LT Profits):
    The more complicated a formula, the greater probability of data mining and thus the less predictive the formula is for the future. That said, you can still validate your formula by back-testing it further over untouched data.
    LT is 100% correct. You cannot data mine and expect it to be profitable in the future. There are plenty of stat places out there that go back beyond 2004. Test your "system" on fresh data, at least 10 years of fresh numbers. If ya can't do that, then go back as far as you can, unless you are comfortable with losing your money over the next few years of forward testing. You might as well save some time and money and back-test it further.
