sports modeling approaches

  • TomG
    SBR Wise Guy
    • 10-29-07
    • 500

    #36
    I would have thought hockey was too "fluid" to be modeled using a discrete model such as a Markov Chain. However, this article presents a simple Markov Chain based model for modeling an NHL game. The basic idea could then be adapted using a database of NHL GameCenter's PBP Data (http://www.nhl.com/scores/htmlreport...0/PL020024.HTM) to create your own NHL simulators (this perhaps could also be applied to soccer).

    http://germain.umemat.maine.edu/facu...s/zaman-01.pdf
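A rough sketch of the general idea, not the article's actual model: treat the game as a chain of discrete states (who has the puck, or a goal just scored) and step it at fixed time intervals. The state names, the 10-second step and every transition probability below are made-up placeholders; in practice they would be estimated from play-by-play data.

```python
import random

# Hypothetical states and transition probabilities (placeholders, not fitted to NHL data).
TRANSITIONS = {
    "home_poss": [("home_poss", 0.60), ("away_poss", 0.38), ("home_goal", 0.02)],
    "away_poss": [("away_poss", 0.60), ("home_poss", 0.385), ("away_goal", 0.015)],
    "home_goal": [("home_poss", 0.5), ("away_poss", 0.5)],  # faceoff after a goal
    "away_goal": [("home_poss", 0.5), ("away_poss", 0.5)],
}


def simulate_game(steps=360, rng=random):
    """Run one game as `steps` chain transitions (360 x 10 s = 60 min).

    Returns (home_goals, away_goals).
    """
    state = "home_poss"
    home = away = 0
    for _ in range(steps):
        next_states, weights = zip(*TRANSITIONS[state])
        state = rng.choices(next_states, weights=weights)[0]
        if state == "home_goal":
            home += 1
        elif state == "away_goal":
            away += 1
    return home, away


def outcome_shares(n_games=2000, seed=1):
    """Share of simulated games ending in a home win, a tie, or an away win."""
    rng = random.Random(seed)
    tally = {"home": 0, "tie": 0, "away": 0}
    for _ in range(n_games):
        h, a = simulate_game(rng=rng)
        tally["home" if h > a else "away" if a > h else "tie"] += 1
    return {k: v / n_games for k, v in tally.items()}
```

Calling `outcome_shares()` gives regulation-time shares only; overtime and shootouts are not modeled here.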

    Comment
    • Indecent
      SBR Wise Guy
      • 09-08-09
      • 758

      #37
      A common recommendation for sports modeling is to leave out a few seasons to use to make sure you aren't overfitting your data.

      With that in mind, what do you look at to make sure the system is not overfitting? Overall accuracy and accuracy against the spread are obvious considerations, but what about comparing the prediction percentages of the home team winning outright, etc?

      How important is consistency among these values in your model? What other considerations do you make when evaluating your model?
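One way to make the comparison concrete is to compute the same summary numbers on the seasons used for training and on the held-out seasons, then look at the gap. This is a minimal sketch: the record layout (`season`, `pick`, `winner`) and the toy data are assumptions for illustration, not anyone's real results.

```python
def split_by_season(games, holdout_seasons):
    """Separate games into (training, holdout) lists by season."""
    train = [g for g in games if g["season"] not in holdout_seasons]
    holdout = [g for g in games if g["season"] in holdout_seasons]
    return train, holdout


def summarize(games):
    """Straight-up accuracy plus how often the model picks the home team
    versus how often the home team actually wins."""
    n = len(games)
    return {
        "accuracy": sum(g["pick"] == g["winner"] for g in games) / n,
        "home_pick_rate": sum(g["pick"] == "home" for g in games) / n,
        "home_win_rate": sum(g["winner"] == "home" for g in games) / n,
    }


# Toy records, made up purely to show the mechanics.
games = [
    {"season": 2007, "pick": "home", "winner": "home"},
    {"season": 2007, "pick": "away", "winner": "away"},
    {"season": 2008, "pick": "home", "winner": "away"},
    {"season": 2008, "pick": "home", "winner": "home"},
]

train, holdout = split_by_season(games, holdout_seasons={2008})
train_stats, holdout_stats = summarize(train), summarize(holdout)
gap = {k: train_stats[k] - holdout_stats[k] for k in train_stats}
```

A large drop in accuracy from training to holdout, or a home pick rate that drifts far from the actual home win rate on the held-out seasons, would both be warning signs worth checking.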
      Comment
      • Indecent
        SBR Wise Guy
        • 09-08-09
        • 758

        #38
        OK, how about a different question...

        Say you had to pick between a system that was very good at predicting against the spread and one that was very good at predicting straight up winner.

        Which would you select?
        Comment
        • TomG
          SBR Wise Guy
          • 10-29-07
          • 500

          #39
          Originally posted by Indecent
          OK, how about a different question...

          Say you had to pick between a system that was very good at predicting against the spread and one that was very good at predicting straight up winner.

          Which would you select?
          Whichever wager had the higher expected growth. There may be a slight preference for the favorite's ML and dog's points. In general, though, I would prefer the point spread.

          However your question is mostly irrelevant--A system that was very good ATS could relatively easily be modified to also be very good straight up (and vice-versa).
          Comment
          • Indecent
            SBR Wise Guy
            • 09-08-09
            • 758

            #40
            Thanks for your input. I'm not convinced the spread is best, especially if the system is good at picking up underdog winners, but I've done no math to back up this claim. I'm just trying to get my creative juices flowing in modifying my algorithm and was hoping for inspired conversation from some new people.

            Originally posted by TomG
            However your question is mostly irrelevant--A system that was very good ATS could relatively easily be modified to also be very good straight up (and vice-versa).
            That depends on how the system is constructed. I'm using an artificial intelligence technique that is not easily tweakable once trained, but I can customize how it is trained and what it learns. Consider these examples:

            One system is 74% straight up but misses by an average of 25pts a game. In this case it is focusing on learning how to predict the winner correctly, but predicting a blowout for every game rather than something close to the final score. It is hardly ever correct against the spread unless correctly predicting the underdog to win straight up.

            Another system is 58% straight up but 53% against the spread, with average points missed at 10.4. Obviously it's much better at approximating the final score, but correct about the winning side less frequently.

            For reasons I don't think matter in this discussion, the algorithm can't have both. It is either really good at predicting the final score, or really good at predicting the final winner, or mediocre at learning to combine both skillsets.

            The question then becomes whether one is inherently higher +EV than the other, or if it depends on the underdog accuracy.
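The three numbers quoted for each system (straight-up accuracy, accuracy against the spread, average points missed) can all be computed from the same records. The sketch below assumes each game stores the predicted and actual margin from the home team's side plus the home team's line (for example `-7` when the home team is favored by 7). The field names are hypothetical, and "points missed" is read here as the error on the margin.

```python
def evaluate(games):
    """Return straight-up accuracy, ATS accuracy (pushes excluded) and
    mean absolute error of the predicted margin."""
    su_hits = ats_hits = ats_games = 0
    abs_err = 0.0
    for g in games:
        pred, actual, line = g["pred_margin"], g["actual_margin"], g["home_line"]
        # Straight up: did the model pick the side that won? (assumes no ties)
        su_hits += (pred > 0) == (actual > 0)
        abs_err += abs(pred - actual)
        cover = actual + line  # > 0 means the home team covered
        if cover != 0:         # skip pushes
            ats_games += 1
            ats_hits += (pred + line > 0) == (cover > 0)
    n = len(games)
    return {
        "su": su_hits / n,
        "ats": ats_hits / ats_games if ats_games else None,
        "mae": abs_err / n,
    }


# Two made-up games from a "predict a blowout every time" system.
blowout_picks = [
    {"pred_margin": 28, "actual_margin": 3, "home_line": -7},   # right winner, wrong ATS
    {"pred_margin": -30, "actual_margin": -6, "home_line": 3},  # right winner, right ATS
]
stats = evaluate(blowout_picks)  # {'su': 1.0, 'ats': 0.5, 'mae': 24.5}
```

The toy data reproduces the pattern described above: every winner is correct, yet the margin error is huge and the ATS record is no better than a coin flip.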
            Comment
            • TomG
              SBR Wise Guy
              • 10-29-07
              • 500

              #41
              What I mean is that either approach will work. Once you have a team's expected Win% you can use a ML --> Spread converter (or create your own) to calculate a fair spread for that Win%. On the other hand, if you have an expected fair spread, then you can convert that to an expected Win%.

              If there is value, it will almost always be on one side. Sometimes taking that team's point spread will present a higher EV wager. Sometimes taking that team's money line will present the higher EV wager. In most cases, the point spread will present the higher EG situation (with preferences toward the favorite's ML and dog points that I mentioned above).
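A minimal sketch of such a converter, under a loud assumption: the final margin is treated as normally distributed around the fair spread with standard deviation `SIGMA`. The value 13.5 is only a placeholder for an NFL-like sport and should be fitted to your own data. "Expected growth" is read here as expected log growth of the bankroll at a given stake fraction (the Kelly sense), which is also an assumption about what was meant.

```python
import math
from statistics import NormalDist

SIGMA = 13.5  # placeholder std dev of final margin; fit this per sport


def win_prob_from_margin(expected_margin, sigma=SIGMA):
    """P(team wins) if it is expected to win by `expected_margin` points."""
    return 1.0 - NormalDist(mu=expected_margin, sigma=sigma).cdf(0.0)


def margin_from_win_prob(p, sigma=SIGMA):
    """Fair expected margin implied by win probability p (0 < p < 1)."""
    return sigma * NormalDist().inv_cdf(p)


def american_to_decimal(odds):
    """Convert American odds (e.g. -110, +150) to decimal odds."""
    return 1.0 + (odds / 100.0 if odds > 0 else 100.0 / -odds)


def expected_value(p, decimal_odds):
    """Expected profit per unit staked."""
    return p * (decimal_odds - 1.0) - (1.0 - p)


def kelly_fraction(p, decimal_odds):
    """Bankroll fraction that maximizes expected log growth (0 if no edge)."""
    b = decimal_odds - 1.0
    return max(0.0, (p * b - (1.0 - p)) / b)


def expected_growth(p, decimal_odds, f):
    """Expected log growth of bankroll when staking fraction f (f < 1)."""
    b = decimal_odds - 1.0
    return p * math.log(1.0 + f * b) + (1.0 - p) * math.log(1.0 - f)


# Hypothetical case: the model makes the team 4 points better,
# the market offers -3 at -110 on the spread and -150 on the money line.
p_win = win_prob_from_margin(4.0)
p_cover = win_prob_from_margin(4.0 - 3.0)  # P(margin > 3)
for label, p, odds in [("spread", p_cover, -110), ("ML", p_win, -150)]:
    d = american_to_decimal(odds)
    f = kelly_fraction(p, d)
    print(label, round(expected_value(p, d), 4), round(expected_growth(p, d, f), 5))
```

Comparing the two printed rows is the EV versus EG comparison described above; which side comes out ahead depends entirely on the prices and on the assumed `SIGMA`.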
              Comment
              • Indecent
                SBR Wise Guy
                • 09-08-09
                • 758

                #42
                Thanks for the clarification. I understand what you mean, but wonder if you are assuming the system is reporting an expected percentage on the outcome (it is not), or are you using the 74% historical accuracy?

                Either way it's a good way to approach the problem and has given me an idea or two. Thanks!
                Comment
                • luigi
                  SBR Rookie
                  • 08-29-09
                  • 32

                  #43
                  this is a very informative thread... i've never done a simulation before, i was wondering if simulations can be done with matrices on excel?? is programming code imperative? it sounds like that's the case from the above posts, but i just wanted to clarify this.

                  secondly, because we have a pitcher/batter matchup, how would one typically assign differing probabilities for outcomes given pitcher ability? clearly i'd need some years of learning before even attempting this.

                  thanks for any responses
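One commonly cited way to blend a batter's rate with a pitcher's rate is the odds-ratio ("log5") method associated with Bill James. Nobody in this thread has said they use it, so treat the sketch below as one possible starting point. The example rates are made up.

```python
def matchup_prob(batter_rate, pitcher_rate, league_rate):
    """Odds-ratio (log5) estimate of an event's probability in one matchup.

    batter_rate  - how often the batter produces the event (e.g. a hit)
    pitcher_rate - how often the pitcher allows it
    league_rate  - league-average rate for the same event
    All three must be strictly between 0 and 1.
    """
    num = batter_rate * pitcher_rate / league_rate
    den = num + (1.0 - batter_rate) * (1.0 - pitcher_rate) / (1.0 - league_rate)
    return num / den


# Made-up example: a .300 hitter against a pitcher who allows a .230 average
# in a league that hits .260.
p_hit = matchup_prob(0.300, 0.230, 0.260)
```

Against a league-average pitcher the formula returns the batter's own rate unchanged, which is a quick sanity check. The same function can be applied per outcome (walk, single, home run and so on) and the results renormalized to sum to 1.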
                  Comment
                  • man3645
                    SBR Sharp
                    • 09-18-09
                    • 269

                    #44
                    Hey

                    watch justin7's blackbox modeling thing
                    Comment
                    • Wrecktangle
                      SBR MVP
                      • 03-01-09
                      • 1524

                      #45
                      Originally posted by luigi
                      this is a very informative thread... i've never done a simulation before, i was wondering if simulations can be done with matrices on excel?? is programming code imperative? it sounds like that's the case from the above posts, but i just wanted to clarify this.

                      secondly, because we have a pitcher/batter matchup, how would one typically assign differing probabilities for outcomes given pitcher ability? clearly i'd need some years of learning before even attempting this.

                      thanks for any responses
                      Excel with VBA can do things, especially under Vista 64 / Excel 2007 as the sheets are smaller and the code tighter. Probably will be even more bug free under Win 7.

                      The guys who do bases tell me that they simulate each hit, pitch, etc and run the sim at least 1000 iterations (Monte Carlo). But then they need to deal with last-minute roster & pitcher changes. With bases having the "tightest line," you may want to start on another sport.
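To give a feel for what such a sim looks like, here is a stripped-down Monte Carlo sketch. It works at the plate-appearance level rather than pitch by pitch, uses one fixed set of outcome probabilities per team, and moves every runner exactly as many bases as the hit is worth. The probabilities are placeholders, not real team numbers.

```python
import random

EVENTS = ["out", "walk", "single", "double", "triple", "homer"]
# Hypothetical per-plate-appearance probabilities (placeholders).
HOME_PROBS = [0.68, 0.08, 0.15, 0.05, 0.005, 0.035]
AWAY_PROBS = [0.70, 0.08, 0.14, 0.045, 0.005, 0.03]
BASES_GAINED = {"single": 1, "double": 2, "triple": 3, "homer": 4}


def half_inning(probs, rng):
    """Simulate one half-inning and return the runs scored."""
    outs = runs = 0
    bases = [False, False, False]  # first, second, third
    while outs < 3:
        event = rng.choices(EVENTS, weights=probs)[0]
        if event == "out":
            outs += 1
        elif event == "walk":
            # Only forced runners advance.
            if bases[0]:
                if bases[1]:
                    if bases[2]:
                        runs += 1
                    bases[2] = True
                bases[1] = True
            bases[0] = True
        else:
            n = BASES_GAINED[event]
            new_bases = [False, False, False]
            for i in (2, 1, 0):  # simplification: each runner advances n bases
                if bases[i]:
                    if i + n >= 3:
                        runs += 1
                    else:
                        new_bases[i + n] = True
            if n == 4:
                runs += 1  # batter scores on a homer
            else:
                new_bases[n - 1] = True
            bases = new_bases
    return runs


def simulate_game(rng):
    """Nine innings, then extra innings until the tie is broken."""
    home = away = inning = 0
    while inning < 9 or home == away:
        away += half_inning(AWAY_PROBS, rng)
        home += half_inning(HOME_PROBS, rng)
        inning += 1
    return home, away


def home_win_prob(n_iter=1000, seed=42):
    """Share of simulated games won by the home team."""
    rng = random.Random(seed)
    wins = sum(h > a for h, a in (simulate_game(rng) for _ in range(n_iter)))
    return wins / n_iter
```

With 1000 iterations the standard error on a win probability near 50% is roughly 1.6 percentage points, so more iterations are needed before small edges against the line can be trusted.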
                      Comment
                      • luigi
                        SBR Rookie
                        • 08-29-09
                        • 32

                        #46
                        Originally posted by Wrecktangle
                        Excel with VBA can do things, especially under Vista 64 / Excel 2007 as the sheets are smaller and the code tighter. Probably will be even more bug free under Win 7.

                        The guys who do bases tell me that they simulate each hit, pitch, etc and run the sim at least 1000 iterations (Monte Carlo). But then they need to deal with last-minute roster & pitcher changes. With bases having the "tightest line," you may want to start on another sport.
                        wow.. every pitch, that's pretty incredible. yeah, right now my knowledge would probably only be suitable for working on more simple ev types of models.

                        thanks for the reply
                        Comment
                        • luigi
                          SBR Rookie
                          • 08-29-09
                          • 32

                          #47
                          Originally posted by man3645
                          watch justin7's blackbox modeling thing
                          thanks, i have seen that before and i found it very helpful
                          Comment
                          • adamcm
                            SBR Rookie
                            • 08-09-09
                            • 31

                            #48
                            Good info here. I created a baseball simulation last year but only implemented it for the last couple of months of the season.

                            The biggest struggle I had was determining how to handle runners on base. For instance if I had a guy on 2nd, and the batter hit a single...what percentage of the time does he score, what percentage does he end up on 3rd and what percentage is he thrown out? Is there any data out there giving averages?

                            Secondly, for those that do this, where do you get your data? Right now I just use Team batting stats for all situations: home, away, night, day, etc, but I don't break it down to the individual player. For pitching I use the individual pitcher stats, and a general bullpen. I grab my data by copying and pasting into Excel and running an application to update my database.

                            I'm looking for ways for improvement, so any ideas are greatly appreciated
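Whatever source the averages come from, the mechanics inside the sim are the same: turn observed counts into probabilities, then sample one outcome each time the situation comes up. A small sketch follows; the counts are invented for illustration and are not real league data.

```python
import random


def estimate_probs(counts):
    """Convert observed outcome counts into probabilities."""
    total = sum(counts.values())
    return {outcome: n / total for outcome, n in counts.items()}


def sample_outcome(probs, rng=random):
    """Draw one outcome according to its probability."""
    outcomes = list(probs)
    return rng.choices(outcomes, weights=[probs[o] for o in outcomes])[0]


# Made-up counts for "runner on 2nd, batter singles".
SINGLE_RUNNER_ON_2ND = {"scores": 60, "holds_third": 36, "thrown_out": 4}

probs = estimate_probs(SINGLE_RUNNER_ON_2ND)
result = sample_outcome(probs, random.Random(0))
```

The same table could be keyed by outs, or by a "long single" versus "short single" split, if the data supports it.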
                            Comment
                            • luigi
                              SBR Rookie
                              • 08-29-09
                              • 32

                              #49
                              Originally posted by adamcm
                              Good info here. I created a baseball simulation last year but only implemented it for the last couple of months of the season.

                              The biggest struggle I had was determining how to handle runners on base. For instance if I had a guy on 2nd, and the batter hit a single...what percentage of the time does he score, what percentage does he end up on 3rd and what percentage is he thrown out? Is there any data out there giving averages?

                              Secondly, for those that do this, where do you get your data? Right now I just use Team batting stats for all situations: home, away, night, day, etc, but I don't break it down to the individual player. For pitching I use the individual pitcher stats, and a general bullpen. I grab my data by copying and pasting into Excel and running an application to update my database.

                              I'm looking for ways for improvement, so any ideas are greatly appreciated
                              good question.. try and see if hardballtimes has this data.. you could separate "long singles" or long doubles where you would stipulate 2 bases moved for runners on base..
                              there probably is a site that has this.. i'll have to check, i remember hardball times had different hitter data that differentiated types of fly balls
                              Comment
                              • shantystar
                                SBR Hall of Famer
                                • 11-13-05
                                • 7299

                                #50
                                Originally posted by TomG
                                I would have thought hockey was too "fluid" to be modeled using a discrete model such as a Markov Chain. However this article presents a simple Markov Chain based model for modeling an NHL game. The basic idea could then be adapted using a database of NHL GameCenter's PBP Data (http://www.nhl.com/scores/htmlreport...0/PL020024.HTM) to create your own NHL simulators (this perhaps could also be applied to soccer).

                                http://germain.umemat.maine.edu/facu...s/zaman-01.pdf
                                nice links
                                Comment