My First Baseball Model - Advice Appreciated

  • SparJMU
    SBR MVP
    • 02-18-10
    • 1648

    #1
    Each time I hear advanced discussions of baseball handicapping, it always leads to the fact that a statistical/mathematical model is the key to identifying value in lines. So this week I attempted to put together my first baseball model. I think you would call it a model, but it's really just an Excel file where I copy and paste a ton of data, and I have formulas set up to use basic stats like Runs Per Game and ERA to predict a final score. The only extra calculations I included involve weighting recent games more heavily, as well as teams' tendencies home/away and against RHP/LHP.

    I understand this is very basic, and either 1) Will take a ton of tweaking before it is of any value to me; or 2) May be too simple and will never really be of any value to me.

    First of all, any general advice as to whether I am heading in the right direction or wasting my time? And secondly, I have a few questions for the experts who might be willing to help me improve this "model".....

    1) Are there any advanced statistics that you believe hold more value and therefore should be included, and how might I implement them? For example I bring WHIP into my file, but I am not sure how to translate that into how many runs the pitcher is likely to allow?

    2) Let's say I eventually come up with something that proves to be fairly accurate. How do I compare my forecast with the Vegas line to determine if there is value there? The easy example: If I can accurately predict that two teams are evenly matched with a final projected score of 4-4, but one team is +180, obviously I have value in the underdog. But let's say my projection tells me the home team will win 4-3, and the Vegas line is -135. I have no idea if I have value.

    Thank you very much in advance to anyone who has experience with this and may help me turn this project into something of value.
  • jgilmartin
    SBR MVP
    • 03-31-09
    • 1119

    #2
    http://video.sbrforum.com/video-1268...oneylines.html - You will probably find this video useful
    • roasthawg
      SBR MVP
      • 11-09-07
      • 2990

      #3
      Instead of attempting to predict the final score (4-3 in your example), you can try to predict the percentage of the time that the home/away team will win the game. If your calculations tell you that the home team will win the game 57% of the time, you can then use the Kelly criterion to determine how much you should bet on the game.
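To make the Kelly step concrete, here is a minimal sketch in Python. It assumes you already have a win probability from your model (the 57% above) and an American moneyline; the -120 price and the function names are hypothetical, chosen only for illustration. It computes the full-Kelly fraction, which many bettors scale down in practice.

```python
def decimal_odds(moneyline: int) -> float:
    """Convert an American moneyline to decimal odds (total return per unit staked)."""
    if moneyline > 0:
        return 1 + moneyline / 100
    return 1 + 100 / abs(moneyline)


def kelly_fraction(win_prob: float, moneyline: int) -> float:
    """Full-Kelly fraction of bankroll to stake; 0.0 when there is no positive edge."""
    b = decimal_odds(moneyline) - 1          # net profit per unit staked
    f = (b * win_prob - (1 - win_prob)) / b  # classic Kelly formula (bp - q) / b
    return max(f, 0.0)


# Hypothetical example: the model says 57%, the book offers -120.
stake = kelly_fraction(0.57, -120)
print(f"Stake {stake:.1%} of bankroll")
```

If the model's probability is no better than the price implies, the function returns 0.0, which is the "no bet" case.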
      • IrishTim
        SBR Wise Guy
        • 07-23-09
        • 983

        #4
        Well, you can put that projected final score through a modified Pythagorean expectation formula to generate your win percentages.
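A rough sketch of that conversion in Python. The post does not say which modification is meant, so the exponent here is an assumption: 1.83 is a commonly cited baseball value, and 2 gives the classic form. The function name is hypothetical.

```python
def pythagorean_win_prob(runs_for: float, runs_against: float,
                         exponent: float = 1.83) -> float:
    """Pythagorean expectation: RF^x / (RF^x + RA^x).

    The exponent is an assumption; tune it against your own historical data.
    """
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)


# The projected scores from the first post:
print(pythagorean_win_prob(4, 4))  # evenly matched teams
print(pythagorean_win_prob(4, 3))  # home team projected to win 4-3
```

With the default exponent, a 4-3 projection works out to roughly a 63% win probability for the home team.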
        • jgilmartin
          SBR MVP
          • 03-31-09
          • 1119

          #5
          Yes. And you can use http://www.sbrforum.com/Betting+Tool...Converter.aspx to go between win percentage and moneyline odds and vice versa. If you enter the oddsmaker's moneyline odds into the converter, you can get the implied probability, and then compare that to your expected winning percentage.
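The same conversion the linked tool performs can be sketched in a few lines of Python. This uses the standard American-odds formulas; the function names and the 0.60 model probability are hypothetical. Note that a single moneyline's implied probability includes the bookmaker's margin, so this is a conservative comparison rather than a "true" market probability.

```python
def implied_prob(moneyline: int) -> float:
    """Break-even win probability implied by an American moneyline."""
    if moneyline < 0:
        return -moneyline / (-moneyline + 100)
    return 100 / (moneyline + 100)


def edge(model_prob: float, moneyline: int) -> float:
    """Model probability minus implied probability; positive suggests value."""
    return model_prob - implied_prob(moneyline)


# Lines from the first post's examples:
print(f"+180 implies {implied_prob(180):.1%}")   # underdog in the evenly matched game
print(f"-135 implies {implied_prob(-135):.1%}")  # favorite in the 4-3 projection

# Hypothetical model probability of 0.60 for the -135 favorite:
print(f"edge at -135: {edge(0.60, -135):+.1%}")
```

So a -135 favorite needs to win a little over 57% of the time just to break even; there is value only if your model's probability is above that.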
          • TomG
            SBR Wise Guy
            • 10-29-07
            • 500

            #6
            Read Michael Murray's book on baseball.
            • suicidekings
              SBR Hall of Famer
              • 03-23-09
              • 9962

              #7
              Originally posted by SparJMU
              Each time I hear advanced discussions of baseball handicapping, it always leads to the fact that a statistical/mathematical model is the key to identifying value in lines. So this week I attempted to put together my first baseball model. I think you would call it a model, but it's really just an Excel file where I copy and paste a ton of data, and I have formulas set up to use basic stats like Runs Per Game and ERA to predict a final score. The only extra calculations I included involve weighting recent games more heavily, as well as teams' tendencies home/away and against RHP/LHP.
              I used to maintain a similar model for evaluating MLB games. Because baseball is so stat intensive with so many variables (SP, RP, team batting vs LHP/RHP, home/away, day/night, strength of recent opponents, etc), I found that the output could be very misleading if you're not very meticulous about describing each matchup accurately. If you have the database to support such a detailed approach, then great, however I feel like my MLB capping improved when I simplified my approach to working on a game by game basis.

              Now I start by screening games by:

              Starting Pitcher (all in terms of WHIP & BAA): Day/Night overall, Home/Away overall, Last 3 games, vs LH/RH hitters, expected IP.
              Relief Pitching: bullpen rest, WHIP & BAA
              Hitting: vs LHP/RHP, recent form (weighted by opposing SPs faced), specific matchup history vs SP

              ERA shouldn't really factor into capping in baseball; it is not an accurate metric of performance.
              • SparJMU
                SBR MVP
                • 02-18-10
                • 1648

                #8
                Thanks for the tips guys. So what I picked up from all of this.....

                Continue using stats to predict a final score. However, the most efficient way to do this would be to analyze each individual player in that day's lineup as opposed to looking at the team as one unit (explained in Justin's video). I also ought to consider stats like WHIP and BAA as opposed to ERA. Then, using that predicted score, calculate a winning percentage based on the Pythagorean formula, and finally convert that to a money line. Got it.

                So my followup questions are.........

                1) Is there an efficient way to gather stats for each individual player and organize them? Right now I am copying and pasting one matchup sheet and that's the end of it. If I were to analyze every player, I imagine it could take a very long time.

                2) Where can I find stats like BAA?

                3) Suicide, how do I look at stats like WHIP and BAA and convert that to a predicted outcome? ERA is easy, take the game ERA and multiply it by the fraction of 9 innings I think the pitcher will play. However WHIP and BAA don't translate directly into runs do they? I am not sure how to go forward with that.

                Thanks a lot everyone.
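For reference, the ERA calculation described in question 3 looks like this as a small Python sketch. The function name and the sample numbers are hypothetical, and ERA only counts earned runs, so this slightly understates total runs allowed.

```python
def expected_earned_runs(era: float, expected_innings: float) -> float:
    """Scale a per-9-innings ERA to the innings the pitcher is expected to throw."""
    return era * expected_innings / 9.0


# Hypothetical starter: 4.50 ERA, expected to go 6 innings.
print(expected_earned_runs(4.50, 6.0))
```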
                • uva3021
                  SBR Wise Guy
                  • 03-01-07
                  • 537

                  #9
                  Multiply WHIP by pi; that will give you something close to the average overall ERA among starting pitchers.
                  • suicidekings
                    SBR Hall of Famer
                    • 03-23-09
                    • 9962

                    #10
                    You have to express the runs for/against in terms of the other stats. For that you need to determine the correlation between the player stats (WHIP is essentially the inverse of OBP) and runs allowed. Take a look at this document. It discusses the use of the Pythagorean Expectation formula in terms of OBP & SLG (p.31) as opposed to just runs scored/allowed. By examining data from past seasons and using linear regression they determined suitable coefficients to use in the equation (p.20).

                    However if you're using this, you would need to adjust the simple equation shown to reflect both the starting pitcher and the bullpen, weighted by innings pitched. I just started playing with this myself, using data from Covers in the player stats section.
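One way to sketch the innings-pitched weighting in Python. This is an assumption about how the adjustment might be done, not the method from the linked document: it blends a per-inning rate stat such as WHIP between the starter and the bullpen according to how many of the nine innings the starter is expected to cover. The function name and sample values are hypothetical.

```python
def blended_rate(starter_rate: float, bullpen_rate: float,
                 expected_starter_ip: float, game_innings: float = 9.0) -> float:
    """Innings-weighted average of a per-inning rate stat (e.g. WHIP)."""
    # Clamp the starter's innings to the range [0, game_innings].
    starter_ip = min(max(expected_starter_ip, 0.0), game_innings)
    starter_share = starter_ip / game_innings
    return starter_share * starter_rate + (1 - starter_share) * bullpen_rate


# Hypothetical matchup: starter WHIP 1.20 over 6 innings, bullpen WHIP 1.40 for the rest.
print(round(blended_rate(1.20, 1.40, 6.0), 3))
```

The blended figure can then stand in for the single pitching input wherever the simple equation expects one.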
                    Last edited by suicidekings; 07-05-10, 01:10 AM.
                    • SparJMU
                      SBR MVP
                      • 02-18-10
                      • 1648

                      #11
                      Great, thanks a lot. I am going to work with this for the next few days and see what happens.
                      • Waz
                        SBR Sharp
                        • 12-25-08
                        • 262

                        #12
                        Building a good baseball model that actually has good predictive power takes quite a long time. I've been working on mine for almost 10 years and I still make tweaks to it every single season. The difficult part is balancing the amount of data to incorporate versus ease of use. You don't want to make the model so complex that it takes hours of time to populate the variables, but you also want to make sure you incorporate enough to be useful. This just takes experience and tinkering. Given the amount of statistics in baseball, 100 people will create 100 different models (and two different models can both be useful). Good luck!