1. #1
    podonne
    Join Date: 07-01-11
    Posts: 104

    How to mathematically validate a "system\angle"

    Greetings,

    I've heard many different methods of looking at this question, so I thought I would ask on this forum and see what methods are common to sports betting.

    Say I have a system\angle of the form "Bet on any team that has a certain set of characteristics and has moneyline odds of > some value". I'm sure you've all seen thousands of these, like "Bet on any team playing an opponent after winning against that opponent in the prior game and has moneyline odds of > +110".

    I used to just average the payouts from a series of system bets and demand at least 30 winners, but now I understand more about sampling and probability distributions, so I've moved to bootstrapping a confidence interval. But I'm still unsure whether to just look at the previous x games, or keep looking over all the available data. And "streakiness" still concerns me.
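
    For readers who have not seen the technique, a percentile bootstrap of the average payout can be sketched as below. This is only an illustration: the function name, the 10,000 resamples and the 55-45 record at +110 are assumptions, not the poster's actual code or data. Note that a plain bootstrap treats bets as independent, so it does not address the "streakiness" concern.

```python
import random

def bootstrap_mean_ci(payouts, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean payout per bet."""
    rng = random.Random(seed)
    n = len(payouts)
    means = []
    for _ in range(n_resamples):
        # Resample n bets with replacement and record the resampled mean.
        sample = rng.choices(payouts, k=n)
        means.append(sum(sample) / n)
    means.sort()
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Hypothetical record: 55 winners at +110 (profit 1.10 per unit staked), 45 losers.
payouts = [1.10] * 55 + [-1.0] * 45
low, high = bootstrap_mean_ci(payouts)
print(f"mean = {sum(payouts) / len(payouts):.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

    If the lower end of the interval sits below zero, the sample alone does not rule out a break-even or losing angle.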

    So, what would you consider the "gold standard" (mathematically and statistically valid) test in order to begin betting on an angle? Would you look at all available games, or just recent games? How far back would be reasonable (5 years, 10 years?). How many games would you need to see results for and over what time frame?

    All thoughts welcome. It would be nice to have a standard test to apply to new angles.

    Regards,
    Philip

  2. #2
    suicidekings
    Join Date: 03-23-09
    Posts: 9,962

    There's no general solution to this question. More data is good as long as you can qualify the value of that data by checking that the surrounding criteria are unchanged. You might see value in the dog ATS under two particular conditions (e.g. at home, and coming off a home win) but just because a sample set of 50 games looks promising, it doesn't mean that the books haven't also noticed the trend and adjusted the lines accordingly in the last 20 games of that sample to remove the value in the line. One particular system might only have 2 conditions but there could be 10 criteria you need to look at to verify the validity of your data as to whether or not the angle is really a viable play or just a historical anomaly that is no longer relevant.

  3. #3
    Dark Horse
    Deus Ex Machina
    Join Date: 12-14-05
    Posts: 13,764

    Quote Originally Posted by podonne View Post
    Greetings,

    I've heard many different methods of looking at this question, so I thought I would ask on this forum and see what methods are common to sports betting.

    Say I have a system\angle of the form "Bet on any team that has a certain set of characteristics and has moneyline odds of > some value". I'm sure you've all seen thousands of these, like "Bet on any team playing an opponent after winning against that opponent in the prior game and has moneyline odds of > +110".

    I used to just average the payouts from a series of system bets and demand at least 30 winners, but now I understand more about sampling and probability distributions, so I've moved to bootstrapping a confidence interval. But I'm still unsure whether to just look at the previous x games, or keep looking over all the available data. And "streakiness" still concerns me.

    So, what would you consider the "gold standard" (mathematically and statistically valid) test in order to begin betting on an angle? Would you look at all available games, or just recent games? How far back would be reasonable (5 years, 10 years?). How many games would you need to see results for and over what time frame?

    All thoughts welcome. It would be nice to have a standard test to apply to new angles.

    Regards,
    Philip
    Basically, those are all nonsense. Data mining where artificial parameters are set so that a small sample size looks as if there's value. The only time they may work, going forward, is if you can find a strong motivational correlation that preferably goes against public perception. Example. Play on NFL team after they lose by at least 20 pts. That may work, because the team will often be highly motivated in the next game, while the public thinks little of the team. But you would still have to qualify it further. If you can't identify any clear reason for the 'wonderful' angle, consider it useless. People that sell these angles are well aware of their uselessness, but realize that a large chunk of the public can be made to believe in and pay cash for them.
    Last edited by Dark Horse; 07-10-11 at 02:38 AM.

  4. #4
    podonne
    Join Date: 07-01-11
    Posts: 104

    Quote Originally Posted by suicidekings View Post
    There's no general solution to this question. More data is good as long as you can qualify the value of that data by checking that the surrounding criteria are unchanged. You might see value in the dog ATS under two particular conditions (e.g. at home, and coming off a home win) but just because a sample set of 50 games looks promising, it doesn't mean that the books haven't also noticed the trend and adjusted the lines accordingly in the last 20 games of that sample to remove the value in the line. One particular system might only have 2 conditions but there could be 10 criteria you need to look at to verify the validity of your data as to whether or not the angle is really a viable play or just a historical anomaly that is no longer relevant.
    This is part of my trouble with recency. Say I have a system with 300 plays over the last year and a positive EV, but when I look at just the last 30 plays it shows a negative EV. There are two ways to interpret that:

    1) The angle was positive in the past but has now turned negative because of some change in the game, so no bet
    2) The angle is positive over the long run so short run negative EVs are common. The system's performance can be expected to revert to the mean, therefore you should bet.

    I just don't know how to evaluate that mathematically. Even if there is no "general" solution, that's perfectly fine so long as we can approach it scientifically to figure out what those other parameters are.
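
    One way to put a number on interpretation 2 is to ask how often a genuinely profitable angle would still lose money over a 30-bet stretch. The sketch below assumes independent bets, a fixed price of +110 and a true win rate of 50% (break-even at +110 is about 47.6%); all three are illustrative assumptions, not figures from the thread.

```python
from math import comb

def prob_losing_window(p_win, n_bets, profit_per_win):
    """P(net profit < 0 over n_bets) for independent 1-unit bets won with probability p_win."""
    total = 0.0
    for wins in range(n_bets + 1):
        profit = wins * profit_per_win - (n_bets - wins)
        if profit < 0:
            # Binomial probability of exactly this many wins.
            total += comb(n_bets, wins) * p_win**wins * (1 - p_win) ** (n_bets - wins)
    return total

p = prob_losing_window(p_win=0.50, n_bets=30, profit_per_win=1.10)
print(f"P(losing 30-bet window) = {p:.3f}")  # about 0.428
```

    Under these assumptions a losing 30-bet window shows up roughly 43% of the time, so a negative last 30 on its own cannot separate interpretation 1 from interpretation 2.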

  5. #5
    podonne
    Join Date: 07-01-11
    Posts: 104

    Quote Originally Posted by Dark Horse View Post
    Basically, those are all nonsense. Data mining where artificial parameters are set so that a small sample size looks as if there's value. The only time they may work, going forward, is if you can find a strong motivational correlation that preferably goes against public perception. Example. Play on NFL team after they lose by at least 20 pts. That may work, because the team will often be highly motivated in the next game, while the public thinks little of the team. But you would still have to qualify it further. If you can't identify any clear reason for the 'wonderful' angle, consider it useless. People that sell these angles are well aware of their uselessness, but realize that a large chunk of the public can be made to believe in and pay cash for them.
    The phenomenon you're talking about is called "overfitting" and is common to all types of learning data models, not just angles. Yes, it is a concern, and yes, it can be used by shady characters to sell fake angles to ignorant people, but that doesn't automatically make every angle found by such a process nonsense. My point in starting the thread was to generate discussion about how to tell whether an angle is nonsense or not.

    You've said that a small sample size would be a concern. I would agree with that. What sample size would make you suspicious? The usual statistical rule of thumb asks for a sample size of at least 30, but I actually think you need a sample of 30 winners, so somewhere between 50 and 70 bets depending on your strike rate. Then again, you have to ask the question about what time period. 70 bets over 10 years probably isn't enough. 5 years? 3?
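
    The "30 winners" arithmetic is just the target number of winners divided by the strike rate, rounded up. A tiny sketch (the function name and the three strike rates are illustrative assumptions):

```python
from math import ceil

def bets_needed(target_winners, strike_rate):
    """Expected number of bets required to collect target_winners at a given strike rate."""
    return ceil(target_winners / strike_rate)

for rate in (0.43, 0.50, 0.60):
    print(f"strike rate {rate:.0%}: about {bets_needed(30, rate)} bets")
```

    A 43% strike rate (typical of underdog prices) lands near 70 bets and a 60% rate near 50, which matches the range quoted above.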

    You also talked about motivational factors being important. That's an interesting one. We could describe that as requiring that any system must include at least one characteristic that is "motivational" in nature. Like "lost the last game as a favorite", since the team might be motivated by losing when they "should have" won.

    That's the kind of discussion I hope to see. Your best friend brings you a "sure thing" angle for free. Assuming you don't just slam the door on anyone and everyone no matter what, what would you ask about the angle to validate that it works before using it?

  6. #6
    bztips
    Join Date: 06-03-10
    Posts: 283

    Hate to agree with DH here (since I trashed him in another thread!), but yes, his answer to your question of "how to tell if an angle is nonsense or not" is the essential point: if you can't identify a very rational explanation for WHY the angle you've discovered exists, then it's probably useless.

  7. #7
    podonne
    Join Date: 07-01-11
    Posts: 104

    Quote Originally Posted by bztips View Post
    Hate to agree with DH here (since I trashed him in another thread!), but yes, his answer to your question of "how to tell if an angle is nonsense or not" is the essential point: if you can't identify a very rational explanation for WHY the angle you've discovered exists, then it's probably useless.
    Don't get me wrong, I'm not disagreeing with DH, I'm just saying that "the angle is not nonsense" is just another requirement for the angle to be good, just like requiring the angle to show a positive EV.

    So if there is a checklist, it might go something like this:

    1) Shows a positive EV for at least 100 bets
    2) Makes some logical sense
    3) Includes a factor that could be considered "motivational"
    4) ...
    n)

    Just because data mining came up with an angle doesn't automatically make it invalid.

  8. #8
    MonkeyF0cker
    Join Date: 06-12-07
    Posts: 12,144
    Betpoints: 1127

    Quote Originally Posted by podonne View Post
    Just because data mining came up with an angle doesn't automatically make it invalid.
    Umm. How exactly do you expect to validate it mathematically if it's derived via data mining?

    Your list of subjective qualifiers isn't math.

  9. #9
    podonne
    Join Date: 07-01-11
    Posts: 104

    Quote Originally Posted by MonkeyF0cker View Post
    Umm. How exactly do you expect to validate it mathematically if it's derived via data mining?
    That's the question I'm asking. I have some ways I do it, but I'm curious how other people go about proving that the angle is valid. I don't think having that information public is going to hurt anyone's ROI but it would be a tremendous help to newbies, keep them out of trouble. Like me. :-)
    Your list of subjective qualifiers isn't math.
    There are many examples of using a set of qualifiers to select a subset of a population with different characteristics. see: http://en.wikipedia.org/wiki/Classifier_(mathematics), also http://en.wikipedia.org/wiki/Bayesian_spam_filtering for a practical example. It is, indeed, math.

  10. #10
    MonkeyF0cker
    Join Date: 06-12-07
    Posts: 12,144
    Betpoints: 1127

    Quote Originally Posted by podonne View Post
    That's the question I'm asking. I have some ways I do it, but I'm curious how other people go about proving that the angle is valid. I don't think having that information public is going to hurt anyone's ROI but it would be a tremendous help to newbies, keep them out of trouble. Like me. :-)

    There are many examples of using a set of qualifiers to select a subset of a population with different characteristics. see: http://en.wikipedia.org/wiki/Classifier_(mathematics), also http://en.wikipedia.org/wiki/Bayesian_spam_filtering for a practical example. It is, indeed, math.
    So you're trying to apply classification probability to validating forecast probability? Good luck with that.

    Apples and oranges.

    By the way, as for your subjective list, tell me how you quantify the probability that a factor is motivational so you can input that into Bayes. Thanks.
    Last edited by MonkeyF0cker; 07-11-11 at 02:23 AM.

  11. #11
    Dark Horse
    Deus Ex Machina
    Join Date: 12-14-05
    Posts: 13,764

    Quote Originally Posted by podonne View Post
    Just because data mining came up with an angle doesn't automatically make it invalid.
    True. Data mining, combined with critical thinking and void of a desire to see things that aren't there, is not unscientific. And an 80% miracle angle that, upon closer investigation, turns out to be 'just' 54% is still valuable. As long as you realize you're about to plant your future crops in a minefield.

    The first thing you would have to do is throw out the data that were part of identifying the angle. Then you can see if it works going forward. As soon as you make the slightest adjustment in this forward process, all the old data are out again. (Theoretically, when millions of combinations are scanned by software, and contrary to what I said earlier, it wouldn't even be necessary to understand the underlying cause, as long as the pattern continued after you recognized it. However, I don't think that applies to sports betting, where sample sizes never reach those kinds of levels).

    The angles you talk about are served up as conclusions. When someone presents you with a conclusion wouldn't you want to know how he reached it? Cause and effect. Chances are you might be shocked to know what you bet your money on if the angle identifier revealed his thought process. So why would you want to put all this work into tracing the -mostly false- conclusions of others to underlying causes? Why not instead start with a hypothesis of your own, a reasonable expectation that a certain circumstance may produce a certain effect? Could save a lot of time chasing shadows.

  12. #12
    chunk
    Join Date: 02-08-11
    Posts: 805
    Betpoints: 19168

    Dark Horse has given some good advice. If one relies on published angles that are easily accessible for free, they are for the most part useless imho. I know it's simplistic, but here's my 2 cents:

    1. Find your own angles.
    2. Back test.
    3. Test in real time.
    4. If desired hit rates are achieved, validate by kicking book's arse.

  13. #13
    Pancho sanza
    Join Date: 10-18-07
    Posts: 386

    Quote Originally Posted by podonne View Post
    Don't get me wrong, I'm not disagreeing with DH, I'm just saying that "the angle is not nonsense" is just another requirement for the angle to be good, just like requiring the angle to show a positive EV.

    So if there is a checklist, it might go something like this:

    1) Shows a positive EV for at least 100 bets
    2) Makes some logical sense
    3) Includes a factor that could be considered "motivational"
    4) ...
    n)

    Just because data mining came up with an angle doesn't automatically make it invalid.
    Change # 1 to 1000 minimum

  14. #14
    Formulawiz
    Join Date: 01-12-09
    Posts: 1,589

    http://www.sportrends.com/pfoverview.htm

    Check this out for pro football. This software gives you the opportunity to utilize both trend and situation analysis. I find this very useful for both college and pro football. I am not sure if this is what you are looking to do.

  15. #15
    podonne
    Join Date: 07-01-11
    Posts: 104

    Quote Originally Posted by Pancho sanza View Post
    Change # 1 to 1000 minimum
    That's quite a lot of bets. :-) What's the thinking behind this number? Over what time frame would you want to see the 1000 bets? 5 years? 3?

    Thanks,
    Philip

  16. #16
    podonne
    Join Date: 07-01-11
    Posts: 104

    Quote Originally Posted by chunk View Post
    Dark Horse has given some good advice. If one relies on published angles that are easily accessible for free, they are for the most part useless imho. I know it's simplistic, but here's my 2 cents:

    1. Find your own angles.
    2. Back test.
    3. Test in real time.
    4. If desired hit rates are achieved, validate by kicking book's arse.
    Totally agree, published angles are nearly always worthless.

    It's not (or shouldn't be) strictly necessary to test in real time in order to validate the hit rate. All you'd need to do is restrict the data from step 2) so that it only includes games more than two weeks ago, and then step 3 could be "3) simulate the last two weeks" using the two weeks of unseen data. You'll know how the angle would have performed had you found it two weeks ago and bet it over the last two weeks, but you won't have put any money at risk.

    The hard part is writing the specific process behind those steps. Is two weeks the right time? How positive is positive to bet on it, etc...

  17. #17
    podonne
    Join Date: 07-01-11
    Posts: 104

    Quote Originally Posted by MonkeyF0cker View Post
    So you're trying to apply classification probability to validating forecast probability? Good luck with that. Apples and oranges.
    Well, I don't think I'm doing that. I'm using a classification process to identify sub-populations of a larger population that are a statistically significantly better bet than the entire population. The question then is how to "confirm" that the sub-population is a better bet.

    Quote Originally Posted by MonkeyF0cker View Post
    By the way, as for your subjective list, tell me how you quantify the probability that a factor is motivational so you can input that into Bayes. Thanks.
    Sorry, I didn't understand what you meant. The notion of being motivational applies to the angle, as in, "the angle must consider at least one factor that is motivational", there's no probability. All I meant was that as the very last step, you've got the angle totally validated, all the math points to positive, the last thing you do is look at the angle and make sure you consider one of the factors to be motivational.

    For the sake of clarity, let's assume that you've looked at the list of factors for a certain angle and noticed that, yes, it does include a motivational factor (like "lost by > 20 in last game"). What else would you look at to validate the angle?

  18. #18
    podonne
    Join Date: 07-01-11
    Posts: 104

    Quote Originally Posted by Dark Horse View Post
    True. Data mining, combined with critical thinking and void of a desire to see things that aren't there, is not unscientific. And an 80% miracle angle that, upon closer investigation, turns out to be 'just' 54% is still valuable. As long as you realize you're about to plant your future crops in a minefield.
    This is exactly the conversation I was hoping to have in this thread. A reasonable, scientific approach to validating an angle, free of subjectivity that can come from "a desire to see things that aren't there". :-)

    Quote Originally Posted by Dark Horse View Post
    The first thing you would have to do is throw out the data that were part of identifying the angle. Then you can see if it works going forward. As soon as you make the slightest adjustment in this forward process, all the old data are out again. (Theoretically, when millions of combinations are scanned by software, and contrary to what I said earlier, it wouldn't even be necessary to understand the underlying cause, as long as the pattern continued after you recognized it. However, I don't think that applies to sports betting, where sample sizes never reach those kinds of levels).


    Great point. I'd agree that the first step should be to split the data into 2 sets, one for angle finding and one for validating.
    • How large should the two sets be? (50% in each, or 66% and 33%)
    • How should the data be divided? (Split based on time, so years 1 and 2 are in the finding set, and 3 and 4 are in the validating set, or split based on randomly choosing a certain percent of the games)
    • How "close" do the +EV values need to be between the two sets (just both need positive, within 1%, etc)
    Also, I love your line "as long as the pattern continued after you recognized it" as a very elegant description of the problem. What would you look at in a series of angle bets to identify that?
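
    The two ways of dividing the data mentioned in the list above can be sketched as follows. The record layout (a dict with a "date" key), the function names and the toy schedule are all hypothetical; the thread does not say which split is preferable.

```python
import random
from datetime import date

def split_by_time(games, cutoff):
    """Finding set: games before the cutoff date. Validating set: games on or after it."""
    finding = [g for g in games if g["date"] < cutoff]
    validating = [g for g in games if g["date"] >= cutoff]
    return finding, validating

def split_at_random(games, validate_fraction=1 / 3, seed=0):
    """Shuffle a copy of the games and hold out a fixed fraction for validating."""
    rng = random.Random(seed)
    shuffled = games[:]
    rng.shuffle(shuffled)
    n_validate = round(len(shuffled) * validate_fraction)
    return shuffled[n_validate:], shuffled[:n_validate]

# Toy data: three games a year over four seasons, alternating wins and losses.
games = [{"date": date(2007 + i // 3, 1 + (i % 3) * 4, 1), "won": i % 2 == 0}
         for i in range(12)]

find_t, valid_t = split_by_time(games, cutoff=date(2009, 1, 1))
find_r, valid_r = split_at_random(games)
print(len(find_t), len(valid_t), len(find_r), len(valid_r))
```

    A split by time mimics betting the angle forward from the day it was found, which also exposes it to any line adjustments the books made later; a random split mixes eras, so it cannot show whether an edge has faded.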

    Quote Originally Posted by Dark Horse View Post
    The angles you talk about are served up as conclusions. When someone present you with a conclusion wouldn't you want to know how he reached it? Cause and effect. Chances are you might be shocked to know what you bet your money on if the angle identifier revealed his thought process. So why would you want to put all this work into tracing the -mostly false- conclusions of others to underlying causes? Why not instead start with a hypothesis of your own, a reasonable expectation that a certain circumstance may produce a certain effect? Could save a lot of time chasing shadows.
    A very valid point, but with a standardized, non-subjective process for validating an angle, the work is very, very small. It takes less than a second to calculate the simple EV\bet of an angle.
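
    As a rough illustration of the "simple EV\bet" calculation, the sketch below averages the profit on 1-unit stakes at American moneyline odds. The helper names and the four sample bets are made up for the example.

```python
def profit_per_unit(american_odds, won):
    """Net profit on a 1-unit stake at American moneyline odds."""
    if not won:
        return -1.0
    if american_odds > 0:
        return american_odds / 100.0      # +120 pays 1.20 per unit staked
    return 100.0 / abs(american_odds)     # -110 pays about 0.909 per unit staked

def ev_per_bet(bets):
    """Average realized profit per 1-unit bet over (odds, won) pairs."""
    return sum(profit_per_unit(odds, won) for odds, won in bets) / len(bets)

# Hypothetical angle results: (moneyline odds, did the bet win?)
bets = [(+120, True), (+115, False), (+130, True), (+110, False)]
print(f"EV per bet: {ev_per_bet(bets):+.3f} units")  # +0.125
```

    This is the historical average only; it says nothing about whether the figure will hold on games the angle has not yet seen.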

  19. #19
    BDiddy
    Join Date: 01-07-11
    Posts: 21

    I read a lot and post little. Simply put, my success comes in the NBA and in simple form. I follow ALL teams, print out schedules ahead of time, study the schedules for what I feel is important, apply a simple mathematical analysis to each matchup and beat the books at a good, consistent rate. I will not get rich off of this way of handicapping games, but it is slow and progressive and most of all PROFITABLE and CONSISTENT. There are numerous trends to follow in the NBA and if you put forth a little effort you will find them. This is my simplest form of handicapping that I apply to my more complicated mathematical models. The problem with me identifying trends and playing those trends is that it is hard to quantify an actual edge on each game, so I bet a fixed percentage on each game because I KNOW that the trend will eventually dry up. When will it dry up? Depends on a lot of factors that you have to discover yourself. I started from complete scratch and have learned and built my bankroll through consistent HARD WORK. I spend roughly 2-4 hrs a night EVERY night during the NBA season on studying and betting trends and hit a consistent 58%. However, do not be fooled by the high win %; as I mentioned, I can only bet small amounts on most trends because I cannot quantify my exact edge, therefore there is a ceiling on the potential growth of my bankroll. I know more mathematical handicappers will bash this, so take it for what it's worth. I will never bet a large amount of my bankroll (more than 2%) on this handicapping style. This is a point I feel many ppl forget to understand. Each "system" or "formula" can provide valuable info to help you build your bankroll... just because I cannot quantify my actual edge doesn't mean I cannot be profitable in the long run... I HAVE BEEN. Good luck to you all! And to clarify, I am not referring to placing a wager bc the Bulls are 14-2 ATS off b2b road games in the month of January. That is not my style.

    Also you must pay attention to EVERYTHING when following trends... you will pick up on when they will be drying up and then you can start playing against those trends. I'm not sure I would even call what I follow "trends". PLEASE understand this is my simplest form of handicapping and I am a firm believer that if you want to do this for a living you MUST have firm mathematical models in place. I use much more extensive models to quantify my actual edge in games and enable myself to bet larger amounts of my bankroll because I know my actual edge. Everything mentioned here is just a tool. Put them all together and I get more positive volume. My mathematical models are not good enough right now for me to stop betting the smaller amounts on my "trend" bets. Hope this makes sense to y'all...

    When I say my models are not good enough, I mean that they do not identify a high enough volume of games for me to play... So my other handicapping styles supplement one another in terms of volume.

    I have several methods that return positive value, but they are all limited in volume, so I add them together and my positive volume increases.
    Last edited by BDiddy; 07-11-11 at 04:50 PM.

  20. #20
    Pokerjoe
    Join Date: 04-17-09
    Posts: 704
    Betpoints: 307

    Multiple problems arise. I don't think I'm saying anything new ITT, just differently, but that's okay, because sometimes different re-statements can be useful.

    A) If you look at 1,000 angles and find one with a 1 in a 1000 chance of being random, you haven't found anything. But guys do that. They'll say "wow, mathematically, there's only 1 chance in a 1000 that this result is just random!"

    B) If you look at historical results, the further back you look, the less relevant the environment is to today's environment.

    C) The things you find, everyone finds. If anyone can find it, it'll quickly be in the line, and the edge is gone.

    D) If you're relying on angles, without score or win chance estimates, you can't tell whether your assumed generic edge is now in the line. And no, trimming your angle to include "Faves of less than 7" or "dogs of +120 or more" doesn't negate this problem.

    E) "Motivation" is going to be very tricky to quantify. The games are like fingerprints. The situations are. You'll never include more than a few of the infinite number of variables available. You can't think "this situation, where Team X has variables a, b, and c going for it" is just like "that situation where Team Y had variables a, b, and c going for it." Team Y's variables were actually a, b, c ..... z, 3z, etc, and thus its relation to Team X's situation is very illusory.
    Last edited by Pokerjoe; 07-13-11 at 11:03 AM.
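
    Point A can be made concrete with a short calculation. Assuming the 1,000 angles are independent and none of them has any real edge (both simplifying assumptions), the chance that at least one still looks like a "1 in 1,000" result is well over half. The Bonferroni threshold shown is one standard, conservative correction and is an addition here, not something proposed in the thread.

```python
def prob_at_least_one_fluke(n_angles, p_value):
    """Chance that at least one of n independent no-edge angles reaches p_value by luck."""
    return 1.0 - (1.0 - p_value) ** n_angles

def bonferroni_threshold(alpha, n_angles):
    """Per-angle significance level that keeps the overall false-alarm rate near alpha."""
    return alpha / n_angles

print(f"{prob_at_least_one_fluke(1000, 0.001):.3f}")  # about 0.632
print(bonferroni_threshold(0.05, 1000))               # roughly 5e-05
```

    In other words, after scanning 1,000 angles, a single angle would need to be far rarer than 1 in 1,000 before it counts as evidence of anything.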

  21. #21
    wrongturn
    Join Date: 06-06-06
    Posts: 2,228
    Betpoints: 3726

    good info in this thread. deep stuff.

  22. #22
    MonkeyF0cker
    Join Date: 06-12-07
    Posts: 12,144
    Betpoints: 1127

    Quote Originally Posted by podonne View Post
    Well, I don't think I'm doing that. I'm using a classification process to identify sub-populations of a larger population that are a statistically significantly better bet than the entire population. The question then is how to "confirm" that the sub-population is a better bet.
    Are you? Again, how do you know that they are statistically significant when they are data mined?


    Sorry, I didn't understand what you meant. The notion of being motivational applies to the angle, as in, "the angle must consider at least one factor that is motivational", there's no probability. All I meant was that as the very last step, you've got the angle totally validated, all the math points to positive, the last thing you do is look at the angle and make sure you consider one of the factors to be motivational.

    For the sake of clarity, let's assume that you've looked at the list of factors for a certain angle and noticed that, yes, it does include a motivational factor (like "lost by > 20 in last game"). What else would you look at to validate the angle?
    We come full circle. That is SUBJECTIVE. You are not quantifying motivation. You're simply considering it an absolute based on some arbitrary qualifier that you pulled out of a hat.

  23. #23
    Dark Horse
    Deus Ex Machina
    Join Date: 12-14-05
    Posts: 13,764

    Quote Originally Posted by podonne View Post
    Great point. I'd agree that the first step should be to split the data into 2 sets, one for angle finding and one for validating.
    • How large should the two sets be? (50% in each, or 66% and 33%)
    • How should the data be divided? (Split based on time, so years 1 and 2 are in the finding set, and 3 and 4 are in the validating set, or split based on randomly choosing a certain percent of the games)
    • How "close" do the +EV values need to be between the two sets (just both need positive, within 1%, etc)
    Also, I love your line "as long as the pattern continued after you recognized it" as a very elegant description of the problem. What would you look at in a series of angle bets to identify that?

    Typically, you would use a Z-score (for the clean sample): (wins - losses) / sqrt(sample size). That gives you the number of standard deviations the record sits above a 50% expectation. Anything over 2 is valid. It's interesting to compare this to percentages. For instance, a 100-80 W/L record has a 1.49 Z-score, but 200-160 W/L has a 2.11 Z-score. The percentage is the same. So you could take different percentages (51-52-53-54-55-56-etc) and see what sample sizes are required for different standard deviations.
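
    The Z-score arithmetic above, and the sample size each win percentage would need to reach a Z of 2, can be sketched as below. The function names are illustrative. The formula assumes even-money bets with a 50% baseline; for moneyline angles at plus prices the break-even rate is not 50% (at +110 it is about 47.6%), so the baseline would need adjusting.

```python
from math import ceil, sqrt

def z_score(wins, losses):
    """Standard deviations above a 50% record: (W - L) / sqrt(W + L)."""
    return (wins - losses) / sqrt(wins + losses)

def bets_for_z(win_pct, z_target=2.0):
    """Smallest n reaching z_target if the record holds at win_pct.

    With W = p*n and L = (1-p)*n, z = (2p - 1) * sqrt(n), so n = (z / (2p - 1))**2.
    """
    return ceil((z_target / (2 * win_pct - 1)) ** 2)

print(round(z_score(100, 80), 2))    # 1.49
print(round(z_score(200, 160), 2))   # 2.11
for pct in (0.52, 0.53, 0.54, 0.55, 0.56):
    print(f"{pct:.0%} winners needs about {bets_for_z(pct)} bets for Z >= 2")
```

    The required sample grows quickly as the win percentage approaches 50%: a 55% record needs about 400 bets, and anything nearer 52% needs thousands.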

    Doesn't matter how the samples are divided. The only sample you're interested in is the clean sample. Anybody can pick a few perfect fruits from a large orchard. Nice for the consumer, but doesn't mean that the seeds of those fruits have any added value. They could have. You're working with those seeds and have no idea how much of that perfection (of the previous generation) was due to built-in resistance to negative conditions, and how much of it was due to ideal external conditions.
