So, I'm getting ready to embark on developing a system. I've got a master's degree in electrical engineering (specifically communications), so I'll be putting a lot of the probability and random-processes knowledge I've gained (from studying noise in communications systems) to work here.
I've been thinking about this problem for a few years now, working on it sporadically, but I keep running into one fundamental problem: what to use as a metric?
As I start to model a team's % chance of winning any particular game, how do I know whether a change I make helps or hurts my system?
I have just two thoughts so far:
1) Bin my predictions: Create bins of winning percentage, where I take all the games in which I predict the favorite to win 50-52.5% of the time, 52.5-55% of the time, etc., and calculate the actual % of games the favorite (my favorite) wins for each of these bins. Ideally, each bin's actual winning percentage would fall right in the middle of the bin.
2) Actually track it. Paper bet the system and see if a change increases or decreases my profitability.
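Idea 1 can be sketched in a few lines. This is a minimal, hypothetical version (the function name and bin edges are my own illustration, not anything from a particular system): group games by predicted favorite win probability and compare each bin's prediction range to the actual win rate.

```python
def calibration_bins(predicted, outcomes, edges):
    """For each [lo, hi) probability bin, report how many games fell in it
    and the actual fraction of those games the favorite won.

    predicted: predicted favorite win probabilities (0-1)
    outcomes:  1 if the favorite won, 0 otherwise
    edges:     ascending bin edges, e.g. [0.50, 0.525, 0.55, ...]
    """
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = [o for p, o in zip(predicted, outcomes) if lo <= p < hi]
        if in_bin:
            rows.append((lo, hi, len(in_bin), sum(in_bin) / len(in_bin)))
    return rows

# Toy example: six games, three bins of width 2.5%
preds = [0.51, 0.52, 0.54, 0.56, 0.51, 0.53]
wins = [1, 0, 1, 1, 0, 1]
for lo, hi, n, rate in calibration_bins(preds, wins, [0.50, 0.525, 0.55, 0.575]):
    print(f"{lo:.3f}-{hi:.3f}: n={n}, actual win rate={rate:.2f}")
```

If the model is well calibrated, each bin's actual win rate should sit near the bin's midpoint (subject to sample-size noise, which is where a z-score check comes in).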
The flaws I see in each are:
1) Relying on bins assumes I'm accounting for all of the predictive variables. Since of course I won't be, the games in each bin are no longer truly comparable, and the missing factors will distort the actual win % within each bin. What will happen is that I'll end up betting the games that happen to have a more favorable line (because their true win % is actually lower than the bin would suggest), and passing on the games with a less favorable line (because their true win % is actually higher than the bin suggests). This is exactly the opposite of what a winning system should do, so I'm thinking this is not such a good idea.
2) In order to gain a significant number of results, I'll be relying on the validity of posted lines. I've been logging my own lines since mid-June of last season, so I know those are accurate, but before that I'll be relying on covers.com lines. Also, it will be less obvious how any particular change affects the system, since I'll be relying on a single output (my return), which doesn't give much feedback about what I fixed or broke.
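For completeness, idea 2 can be sketched too. This is a hypothetical flat-stake paper-betting loop, assuming decimal odds and a simple "bet only when my probability beats the line's implied probability" rule (the name, staking plan, and edge rule are my illustration, not a prescribed method; real lines also carry vig, which this ignores):

```python
def paper_bet(records, stake=1.0):
    """Flat-stake paper betting: bet one unit on the favorite whenever the
    model's win probability exceeds the line's implied probability.

    records: iterable of (model_prob, decimal_odds, won) tuples, won = 1/0.
    Returns (number of bets, total profit, return on turnover).
    """
    profit, bets = 0.0, 0
    for model_prob, odds, won in records:
        implied = 1.0 / odds  # naive implied probability; ignores the vig
        if model_prob > implied:  # perceived positive-EV spots only
            bets += 1
            profit += stake * (odds - 1.0) if won else -stake
    roi = profit / (bets * stake) if bets else 0.0
    return bets, profit, roi

# Toy example: three games at 1.90 decimal odds
games = [(0.60, 1.90, 1), (0.40, 1.90, 0), (0.55, 1.90, 0)]
print(paper_bet(games))
```

The downside noted above still applies: this collapses everything into one number (return), so it tells you whether a change helped overall but not which part of the model it helped.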
So, I'm wondering if there are any ideas for better metrics that will guide me along in this process. Perhaps the answer is no, and that's what makes this so much fun.
(Note: I've read Ganch's post on Z-scores and fully intend to use it, and I've seen this thread as well; between them, maybe they give all the answers to this question that there are...)
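I haven't seen Ganch's exact formulation, but one common z-score check along these lines compares the actual number of wins to what the model's own probabilities predict, normalized by the binomial standard deviation (the function below is my own hedged sketch of that idea, not Ganch's method):

```python
import math

def win_total_z_score(probs, outcomes):
    """Z-score of actual wins versus the model's expected wins:
    z = (actual - expected) / sqrt(sum of p*(1-p)).
    A large |z| suggests the model's probabilities are systematically off.

    probs:    model win probabilities for each game
    outcomes: 1 if that side won, 0 otherwise
    """
    expected = sum(probs)
    variance = sum(p * (1 - p) for p in probs)  # binomial variance per game
    actual = sum(outcomes)
    return (actual - expected) / math.sqrt(variance)

# Toy example: four coin-flip games that all land wins -> z = 2
print(win_total_z_score([0.5, 0.5, 0.5, 0.5], [1, 1, 1, 1]))
```

The same test can be run per bin from idea 1, which gives per-bin feedback instead of one aggregate number.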