Finding a Metric

  • tweek
    SBR Hustler
    • 02-17-09
    • 60

    #1
    Finding a Metric
    So, I'm getting ready to embark on trying to develop a system. I've got a master's degree in electrical engineering (specifically communications), so I'll be putting a lot of the probability and random-process knowledge I've gained (from studying noise in communications systems) to work here.

    I've been thinking about this problem for a few years now, working on it sporadically, but have one fundamental problem: what to use as a metric?

    As I start to try and model a team's % chance of winning any particular game, how do I know if a change I make helps or hurts my system?

    I have just two thoughts so far:

    1) Bin my predictions: Create bins of winning percentage, where I take all the games that I predict the favorite to win 50-52.5% of the time, 52.5-55% of the time, etc... and calculate the actual % of games the favorite (my favorite) wins for each of these bins. Ideally each bin's actual winning percentage would fall dead in the middle of each bin.

    2) Actually track it. Paper bet the system and see if a change increases or decreases my profitability.
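
A minimal sketch of the binning check in 1), assuming the predictions are already available as a list of favorite win probabilities with matching 0/1 outcomes. The function name `calibration_table` and its parameters are hypothetical, chosen only for illustration. It reports each bin's mean predicted probability next to the actual win rate, a slight variation on comparing against the bin's midpoint.

```python
from collections import defaultdict


def calibration_table(probs, wins, lo=0.50, width=0.025):
    """Compare predicted and actual win rates in fixed-width probability bins.

    probs: predicted win probability of the model's favorite (each >= lo)
    wins:  1 if that favorite won, 0 otherwise
    Returns rows of (bin_low, bin_high, games, mean_predicted, actual_rate).
    """
    if len(probs) != len(wins):
        raise ValueError("probs and wins must have the same length")
    bins = defaultdict(lambda: [0, 0.0, 0])  # index -> [games, sum of probs, wins]
    for p, w in zip(probs, wins):
        if p < lo:
            raise ValueError("expected the favorite's probability, so p >= lo")
        # The small epsilon guards against float error at exact bin edges.
        idx = int((p - lo) / width + 1e-9)
        bins[idx][0] += 1
        bins[idx][1] += p
        bins[idx][2] += w
    rows = []
    for idx in sorted(bins):
        n, p_sum, w_sum = bins[idx]
        rows.append((lo + idx * width, lo + (idx + 1) * width,
                     n, p_sum / n, w_sum / n))
    return rows


if __name__ == "__main__":
    # Made-up numbers, purely to show the output shape.
    demo_probs = [0.51, 0.52, 0.51, 0.52, 0.56, 0.57]
    demo_wins = [1, 0, 1, 0, 1, 1]
    for low, high, n, predicted, actual in calibration_table(demo_probs, demo_wins):
        print(f"{low:.3f}-{high:.3f}  n={n}  predicted={predicted:.3f}  actual={actual:.3f}")
```

With only a handful of games per bin the actual rate is very noisy, so the gap between the two columns only means something once each bin holds a large sample.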

    The flaws I see in each are:

    1) Relying on bins makes the assumption that I'm accounting for all of the predictive variables. If I'm not (which of course I won't be), all the games in each bin are no longer "common," and these other factors will screw with the actual %win in each bin. So, what will happen is I will end up betting those games that actually have a more favorable line (because their win % is actually lower than the bin would suggest), and not betting the games that have a less favorable line (as their win % will actually be higher than the bin suggests). This is exactly the opposite of what a winning system will do, so I'm thinking this is not such a good idea.

    2) In order to gain a significant number of results, I'll be relying on the validity of posted lines. I've been logging my own lines since mid-June of last season, so I know those are accurate, but for anything earlier I'll be relying on covers.com lines. Also, it will be less obvious how any particular change affects the system, as I'll be relying on a single output (my return), which does not give much feedback as to what I screwed up or helped.

    So, I'm wondering if there are any ideas for better metrics that will guide me along in this process. Perhaps the answer is no, and that's what makes this so much fun.

    (Note: I've read Ganch's post on Z-scores and fully intend on using it, as well as seen this thread, and between them maybe they give all the answers to this question that there are...)
    Last edited by tweek; 05-13-09, 07:40 PM.
  • Bsims
    SBR Wise Guy
    • 02-03-09
    • 827

    #2
    This has always been a perplexing issue for me. If I have a "system", I'm most comfortable if I can back test the results. In this case I measure the return per dollar. If these previous results were used in developing the system, then the results are problematic.
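
A rough sketch of a return-per-dollar calculation, assuming flat $1 stakes and American moneyline odds (neither is stated in the post). The names `profit_per_dollar` and `return_per_dollar` are hypothetical. The result is dollars returned per dollar staked, so 1.0 is break-even.

```python
def profit_per_dollar(american_odds):
    """Profit on a winning $1 bet at American moneyline odds."""
    if american_odds == 0:
        raise ValueError("American odds cannot be zero")
    if american_odds > 0:
        return american_odds / 100.0   # e.g. +150 wins $1.50 per $1
    return 100.0 / -american_odds      # e.g. -110 wins about $0.909 per $1


def return_per_dollar(bets):
    """bets: iterable of (american_odds, won) pairs, each a flat $1 stake.

    Returns total dollars returned divided by total dollars staked.
    """
    staked = 0
    returned = 0.0
    for odds, won in bets:
        staked += 1
        if won:
            # A winner gets the stake back plus the profit.
            returned += 1.0 + profit_per_dollar(odds)
    if staked == 0:
        raise ValueError("no bets supplied")
    return returned / staked


if __name__ == "__main__":
    # One win and one loss at -110: below 1.0 because of the vig.
    print(return_per_dollar([(-110, True), (-110, False)]))
```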

    Back testing isn't easy because you need to be able to recreate the data you're using on each day and apply it to the system to generate the results for that day. Sometimes, the only thing you can do is track the results forward.

    As I said, I use return/dollar as my primary metric since that is what I'm trying to optimize. Like you, I have also put predicted winning percentages into "bins" and compared them to the actual results in those bins.

    I've also assigned 1 when a home team wins and 0 when they lose. Then I compute the correlation between this and my predicted win probability. This has been particularly useful when I'm tweaking the "system". (Note: the highest correlation here is usually the one between the odds and the winners. That's the correlation you need to exceed.)
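
One way to sketch the correlation check, assuming it is an ordinary Pearson correlation between the 0/1 home-win indicator and a probability (the post does not say which correlation is used). The helper `pearson` and the sample numbers are illustrative only; the market probabilities are assumed to have been derived from the odds beforehand.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up data: 1 = home team won, 0 = home team lost.
    home_won = [1, 0, 1, 1, 0, 0]
    model_prob = [0.62, 0.45, 0.58, 0.51, 0.49, 0.40]   # the system's home win probability
    market_prob = [0.60, 0.48, 0.55, 0.57, 0.44, 0.42]  # implied by the odds
    print("model :", pearson(home_won, model_prob))
    print("market:", pearson(home_won, market_prob))
```

The comparison of interest is whether the model's figure exceeds the market's on the same set of games.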

    Finally, I find graphing the results (like those in the bins) and just eyeballing the results is particularly useful. I will try to post one such graph in my next reply.
    • tweek
      SBR Hustler
      • 02-17-09
      • 60

      #3
      Originally posted by Bsims
      Back testing isn't easy because you need to be able to recreate the data you're using on each day and apply it to the system to generate the results for that day. Sometimes, the only thing you can do is track the results forward.
      I'm using a database of statistics broken down on a game-by-game basis, so I'll be able to calculate any stat going into any game that I need (sourced from retrosheet).

      Originally posted by Bsims
      I've also assigned 1 when a home team wins and 0 when they lose. Then I compute the correlation between this and my predicted win probability. This has been particularly useful when I'm tweaking the "system". (Note: the highest correlation here is usually the one between the odds and the winners. That's the correlation you need to exceed.)
      That's an interesting idea. I'll give that a shot.
      • Bsims
        SBR Wise Guy
        • 02-03-09
        • 827

        #4
        Sorry, but I'm not able to post a sample graph. But trust me, sometimes a picture is worth a thousand words. Recently I did a study on a system, and over 20,000 games it was a very small money loser (it returned $0.995 per dollar). Since it was so close, I decided to pursue it further. One thing I did was sort the games by increasing home line and plot the running net. The net started on a fairly steep downward slope and continued that way through about half the data. Then it turned and moved upward, finishing slightly below zero. But the picture clearly showed that there was a positive subset, and moreover where it started.
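
In place of the graph, here is a rough sketch of the same idea: sort the games by home line, accumulate the net, and locate the lowest point of the curve. It assumes each game is a hypothetical `(home_line, net)` pair, where `net` is the profit or loss on that game; the function names are invented for illustration.

```python
from itertools import accumulate


def cumulative_net_by_line(games):
    """games: iterable of (home_line, net) pairs.

    Returns (lines, running), both ordered by increasing home line, where
    running[i] is the total net over the first i + 1 games.
    """
    ordered = sorted(games, key=lambda g: g[0])
    lines = [g[0] for g in ordered]
    running = list(accumulate(g[1] for g in ordered))
    return lines, running


def turning_point(lines, running):
    """Find where the running net bottoms out.

    Returns (line, net_after): the home line of the first game after the
    lowest point, and the net from that game to the end. Returns None if
    the curve is at its lowest on the final game.
    """
    if not running:
        raise ValueError("no games supplied")
    low = min(range(len(running)), key=running.__getitem__)
    if low == len(running) - 1:
        return None
    return lines[low + 1], running[-1] - running[low]


if __name__ == "__main__":
    # Made-up results, deliberately out of order.
    demo = [(130, 1.3), (-200, -1), (150, -1), (-120, -1), (110, 1.1), (-150, -1)]
    lines, running = cumulative_net_by_line(demo)
    print(turning_point(lines, running))
```

A subset picked this way comes from the same results used to find it, so it would still need checking on games that played no part in choosing the cutoff.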