OK, here's a question that's driving me bonkers. I'm modeling baseball moneylines and am beating the closing line (BTCL) on 67.2% of 336 plays (an outlier subset from the 2,700-game test set). The average line move from open to close, in implied probability, is over 1% in the direction of the model's picks. Also, these data are 100% clean; that much I'm sure of.
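For anyone who wants to check the line-move figure, here is a rough sketch of how a move in implied probability can be computed from American odds. The function names are mine, and it uses raw implied probabilities with the vig left in, which is an assumption; removing the vig first would change the numbers slightly.

```python
def implied_prob(american):
    """Raw implied probability (vig included) from American odds."""
    if american < 0:
        return -american / (-american + 100.0)
    return 100.0 / (american + 100.0)


def line_move(open_odds, close_odds):
    """Change in implied probability for the side bet, open to close.

    Positive means the market moved toward the side that was bet.
    """
    return implied_prob(close_odds) - implied_prob(open_odds)


# Hypothetical example: a side that opens -110 and closes -115
# moves by a bit over one percentage point.
move = line_move(-110, -115)
```

A positive `move` on the side you bet counts as beating the close for that play.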
But the ROI for these plays is underwater at a miserable -2.6%, when theory dictates it should be somewhere around +2-3%. This is using a flat betting approach, where I'm wagering to win 1 unit (on the opening ML), regardless of odds. I haven't bothered to derive a p-value for seeing these stats, but I'm sure it must be in the 0.01 range.
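To put an actual number on that p-value instead of guessing, something like the sketch below could work. It assumes ROI means total profit divided by total units risked under the to-win-1-unit staking described above, and it needs a "true" win probability for each play, which nobody actually knows; plugging in something like the de-vigged closing line is one possible stand-in. All names here are hypothetical, not from any library.

```python
import random


def risk_to_win_one(american):
    """Units risked to win exactly 1 unit at the given American odds."""
    return -american / 100.0 if american < 0 else 100.0 / american


def roi_to_win_one(bets):
    """bets: iterable of (american_odds, won) pairs.

    Returns total profit divided by total units risked.
    """
    risked = 0.0
    profit = 0.0
    for american, won in bets:
        risk = risk_to_win_one(american)
        risked += risk
        profit += 1.0 if won else -risk
    return profit / risked


def simulate_roi_tail(odds, true_probs, observed_roi, trials=10000, seed=0):
    """Monte Carlo estimate of P(ROI <= observed_roi).

    odds: opening American odds for each play.
    true_probs: ASSUMED true win probability for each play.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        bets = [(o, rng.random() < p) for o, p in zip(odds, true_probs)]
        if roi_to_win_one(bets) <= observed_roi:
            hits += 1
    return hits / trials
```

Calling `simulate_roi_tail(opening_odds, assumed_probs, -0.026)` on the 336 plays would give the share of simulated seasons that finish at -2.6% or worse. The answer is only as good as the assumed probabilities, so it is worth trying a couple of different choices.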
Are there any explanations for this other than just being extremely unlucky? The wagers are pretty evenly distributed among favorites and dogs; no crazy skew or anything. I saw the BTCL numbers first and thought "great." Then I looked at the ROI and wanted to throw my laptop out the window. A 0% ROI I could believe, but this seems too bad to be possible.
Weirdly, the totals are just as sharp (if not more so), but the ROI is much more in line with what I expect.
I know the only satisfying answer is "more data," but in the meantime is there anything else I'm missing? For anyone who models MLB, do you think these types of numbers make any sense for an unlucky stretch, or should I be digging deeper?