I'm relatively new to sports betting but not new to gambling. I have always taken an analytical approach through research and data analysis, but I'm worried about data snooping when collecting data from previous years to find profitable betting strategies. How big a problem is it, and at what point is the sample size large enough to be reliable?
Data snooping
clarkacalRestricted User
- 11-03-09
- 353
#1Data snoopingTags: None -
WojoSBR MVP
- 03-19-10
- 1764
#2I don't understand what you mean by data snooping.Comment -
clarkacalRestricted User
- 11-03-09
- 353
#3When you think you've found parameters that are +EV, but they turn out to be +EV only for a specific set of data and not necessarily for future data.Comment -
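A rough way to see the effect described in #3, as a minimal sketch rather than anyone's actual method: simulate a pile of made-up "systems" that all have a true 50% win rate, keep whichever one looks best on past data, then re-test it on fresh data. The function name `snoop_demo` and all of its parameters (200 systems, 300 games each) are assumptions chosen for illustration.

```python
import random

def snoop_demo(n_systems=200, n_games=300, seed=1):
    """Pick the best of many no-edge systems in-sample, then re-test on fresh data."""
    rng = random.Random(seed)

    def win_rate():
        # One hypothetical system: n_games bets, each a fair coin flip (no real edge).
        return sum(rng.random() < 0.5 for _ in range(n_games)) / n_games

    # "Backtest" every system and keep the best-looking historical record.
    best_in_sample = max(win_rate() for _ in range(n_systems))
    # The winner never had an edge, so fresh data is just another 50% draw.
    out_of_sample = win_rate()
    return best_in_sample, out_of_sample

best, fresh = snoop_demo()
print(f"best in-sample win rate: {best:.3f}, same system on new data: {fresh:.3f}")
```

With this many systems tested, the best historical record will typically sit well above 50% even though nothing here has an edge, while the re-test tends to land back near 50%. The more parameters you try against the same data, the more this happens.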
mathdotcomSBR Posting Legend
- 03-24-08
- 11689
#4This is the classic question in inference and it does not have a universal answer.
It depends on the sport, whether the rules changed recently (hockey lockout, steroid crackdown), and so on.
You will never know if something that has been profitable every year until now will be profitable in 2010.Comment -
yankeerickSBR MVP
- 09-07-09
- 1171
#5Comment -
clarkacalRestricted User
- 11-03-09
- 353
#6Mathdotcom, I understand you can't be sure; that's why it's gambling. But what I'm wondering is whether the initial theory is simply a result of that particular data, or whether it comes from a large enough sample that it can be expected to repeat with approximately the same results.
For example, take a game with known odds such as craps, but suppose you didn't know the odds and were just testing data. You gathered data from 5000 rolls of the dice.
Theory a: a seven rolls more often than an 8 at 6.25:5, and a seven rolls more often than a 6 at 5.75:5.
Theory b: first-time, left-handed female shooters have an average roll of 7.75, so you should place the high numbers.
Theory a isn't perfect but it approximates the true odds. Theory b is obviously a joke and you'll go broke, but if you're only dealing with numbers in your theory it isn't as obvious. How do you know which category your theory fits into?Comment -
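The craps example in #6 can be checked directly, since the true odds are known. The sketch below enumerates the 36 two-dice outcomes to get the exact 7-to-8 ratio (6:5, i.e. 1.2), then simulates 5000 rolls and attaches a delta-method standard error to the observed ratio. The function names and the seed are assumptions for illustration only.

```python
import math
import random
from collections import Counter

def exact_ratio(a=7, b=8):
    """Exact ratio of the number of ways to roll total a vs. total b with two fair dice."""
    ways = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))
    return ways[a] / ways[b]

def simulated_ratio(n_rolls=5000, a=7, b=8, seed=0):
    """Observed a-to-b ratio over n_rolls simulated rolls, with an approximate standard error."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls))
    ratio = counts[a] / counts[b]
    # Delta-method approximation for the ratio of two multinomial counts.
    se = ratio * math.sqrt(1 / counts[a] + 1 / counts[b])
    return ratio, se

ratio, se = simulated_ratio()
print(f"exact 7:8 ratio = {exact_ratio():.2f}, observed = {ratio:.3f} +/- {se:.3f}")
```

At 5000 rolls the standard error of the ratio comes out at roughly 0.06, so observed figures like 6.25:5 (1.25) and 5.75:5 (1.15) are within about one standard error of the true 1.2. Theory b is a different animal: a narrow subgroup has far fewer rolls behind it, so its average can drift a long way from 7 by chance alone.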
Flying DutchmanSBR MVP
- 05-17-09
- 2467
#7
some folks might also refer to this as overfitting...
Comment -
SportsbetTrackerSBR Rookie
- 04-30-10
- 26
#8I have been testing systems myself. In fact, it's my major focus. I am in the process of obtaining every box score for the four professional sports and NCAA Division I for basketball and football since 2000. For now, I have the final scores and final lines of all teams for all events, which helps me with some systems (Morrison, et al.).
Let's face it, sports gamblers are people who A) have a sense of rationality, but also, B) a desire to "fit" facts to theories, rather than let the theories acknowledge the facts. For the most part, system players promote A heavily, while implying B just as heavily. They don't consider the unstated C) Facts are facts and cannot be modified.
To that end, though, sports betting IS based upon the HUMAN factor, and NOT the physics factor. Vegas was built on craps, roulette, and slots, NOT on the Super Bowl or the NBA Finals. So there is always going to be handicapping the events, and with every event played, there is another event that can be scrutinized and analyzed for future calculations.
Data mining has its advantages in allowing sharps to utilize historical trends, but it is only a tool, not a true system process.Comment -
PokerjoeSBR Wise Guy
- 04-17-09
- 704
#9Here's the basic conundrum: the smaller the set, the less valid the results, obv. But the further back in time you go to build a bigger set, the less relevant the data is to the current environment. And game environments change in subtle ways. It isn't only things as obvious as the NFL adopting the 2pt conversion.
Often, things work (say, certain passing strategies) only until other coaches see that they work and adopt countering defensive measures. No rule change, just, first, a change of offensive strategy (maybe leading you to say, hey, teams with this offensive strategy/statistical pattern have covered like crazy for the last two years!) followed by a corresponding and countering change in defensive strategy (which is implemented just as you start betting on the offensive strategy).
There is no Holy Grail.Comment -
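To put a rough number on "the smaller the set, the less valid the results", here is a back-of-envelope sketch. It assumes standard -110 pricing (an assumption, not something stated in the thread), where the break-even win rate is 110/210, and asks how many bets it takes before a given true win rate sits `z` standard errors above break-even. The function names are hypothetical, and this ignores statistical power and changing game environments entirely.

```python
import math

def breakeven(american_odds=-110):
    """Break-even win rate at negative American odds, e.g. -110 gives 110/210."""
    if american_odds >= 0:
        raise ValueError("this sketch only handles negative American odds")
    risk = abs(american_odds)
    return risk / (risk + 100)

def bets_needed(true_rate, z=1.96, american_odds=-110):
    """Bets needed for true_rate to sit z standard errors above break-even."""
    p0 = breakeven(american_odds)
    if true_rate <= p0:
        raise ValueError("true_rate must exceed the break-even rate")
    # Standard error of a win rate near p0 is sqrt(p0 * (1 - p0) / n); solve for n.
    return math.ceil(z ** 2 * p0 * (1 - p0) / (true_rate - p0) ** 2)

for rate in (0.54, 0.55, 0.57):
    print(f"true win rate {rate:.2f}: about {bets_needed(rate)} bets")
```

Under these assumptions a genuine 55% system needs on the order of 1,400 bets before it clearly separates from break-even, which is exactly where the tradeoff bites: getting that many bets usually means reaching back into seasons that may no longer resemble the current environment.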
DRZSBR Wise Guy
- 02-24-10
- 918
#10Lots of systems out there; tough to choose one.Comment -
WrecktangleSBR MVP
- 03-01-09
- 1524
#11As for a holy grail, the time machine in the movie Back to the Future worked.Comment -
ZombieWolverineSBR Sharp
- 06-05-10
- 306
#12They need to do a remake of that movie; I think that would be sweet.Comment -
mathdotcomSBR Posting Legend
- 03-24-08
- 11689
#13Answer is still the same.
There is no way to eliminate the tradeoff between small sample issues and introducing biased data.
First thing to do is estimate it both ways and see if it even differs. You may not have a problem after all.Comment -
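One possible reading of "estimate it both ways and see if it even differs" is to split the history into an older block and a recent block and compare the win rates with a two-proportion z-test. That reading, the function name, and the records used below are all assumptions made up for illustration.

```python
import math

def two_proportion_z(wins_a, n_a, wins_b, n_b):
    """z-statistic for the difference between two independent win rates."""
    p_a, p_b = wins_a / n_a, wins_b / n_b
    # Pooled win rate under the hypothesis that both periods share one true rate.
    pooled = (wins_a + wins_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical records: 540-460 in older seasons, 110-90 in the most recent one.
z = two_proportion_z(540, 1000, 110, 200)
print(f"z = {z:.2f}")
```

A z-statistic well inside plus or minus 2 suggests the two periods are not distinguishable, in which case using the longer history costs little. A large value suggests the older data may be describing a different environment.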
Flying DutchmanSBR MVP
- 05-17-09
- 2467
#14Comment -
roasthawgSBR MVP
- 11-09-07
- 2990
#15As much as I like the NFL, with its short season compared to other sports, this is the dominant issue. In longer-season sports the market learns about the teams, and where you might have an advantage early on, it can erode by the end of the season. The NBA shows this pattern nearly every year.
As for a holy grail, the time machine in the movie Back to the Future worked.
As to your point about the NBA, one thing that I've noticed is that the "eroded early season edge" returns in the playoffs... imo this is because there is more "public" money on the games come playoff time, so it's profitable for the books to have a lean.Comment