1. #1
    Waterstpub87
    Slan go foill
    Join Date: 09-09-09
    Posts: 4,044
    Betpoints: 7298

    Normalizing MLB offense to stadiums

    I'm working on a model with an input for park factor. Looking at the park factors, there seems to be a lot of year-to-year variability within the same park. For example, the AT&T Park run factors were 1.045 and 1.05 in 2008 and 2009, favoring the hitter, and then .869, .737, .737, and .942 from 2010 to 2013, favoring the defense. There are several other examples of similar behavior. Could this be down to the weather that year, pitchers pitching significantly better, or crowd effects?

    Has anyone else ever looked at this?

  2. #2
    evo34
    Join Date: 11-09-08
    Posts: 1,032
    Betpoints: 4198

    Some weather; mostly variance. Runs are variable enough that one season is not going to give you enough data to figure out a park factor.
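
To put a rough number on the variance point, here is a minimal simulation sketch. Everything in it is an assumption for illustration: a perfectly neutral park (true factor 1.0), 81 home and 81 road games, combined runs per game drawn from a Poisson distribution with mean 9, and the park factor taken as runs per game at home divided by runs per game on the road. Real run scoring is more spread out than Poisson, so this probably understates the noise.

```python
import math
import random
import statistics


def poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulated_park_factor(rng, games=81, runs_per_game=9.0):
    """One season's park factor for a neutral park (true factor = 1.0).

    runs_per_game is the assumed combined scoring of both teams.
    """
    home_runs = sum(poisson(rng, runs_per_game) for _ in range(games))
    road_runs = sum(poisson(rng, runs_per_game) for _ in range(games))
    return (home_runs / games) / (road_runs / games)


rng = random.Random(42)  # fixed seed so the run is repeatable
factors = [simulated_park_factor(rng) for _ in range(1000)]
spread = statistics.stdev(factors)

print(f"mean single-season factor: {statistics.mean(factors):.3f}")
print(f"standard deviation:        {spread:.3f}")
print(f"lowest / highest season:   {min(factors):.3f} / {max(factors):.3f}")
```

Under these assumptions the usual ratio approximation puts the season-to-season standard deviation at about sqrt(2) * sqrt(9/81) / 9, or roughly 0.05, so swings of several hundredths can show up in a park where nothing has actually changed.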

  3. #3
    EXhoosier10
    Join Date: 07-06-09
    Posts: 3,122
    Betpoints: 4390

    Run a regression using the last 1, 2, 3, 4, 5, etc. years of data to predict the next year's data, and use whichever window gives you the highest correlation or lowest RMSE.
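
The window comparison described above could be sketched roughly like this. It is one possible reading, not a definitive implementation: for each window k it regresses next year's factor on the plain average of the previous k years (simple one-variable least squares) and reports the in-sample correlation and RMSE. The function name `window_scores`, the `factors_by_park` layout, and the synthetic data are all made up for illustration; swap in real park factors to use it.

```python
import math
import random


def window_scores(factors_by_park, max_window=5):
    """Score each window length k = 1..max_window.

    factors_by_park maps a park name to its yearly factors, oldest first.
    Returns {k: (correlation, rmse)} from regressing next year's factor
    on the mean of the previous k years.
    """
    scores = {}
    for k in range(1, max_window + 1):
        xs, ys = [], []
        for series in factors_by_park.values():
            # Start at max_window so every k is scored on the same target seasons.
            for t in range(max_window, len(series)):
                xs.append(sum(series[t - k:t]) / k)
                ys.append(series[t])
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        syy = sum((y - mean_y) ** 2 for y in ys)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        sq_err = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
        scores[k] = (sxy / math.sqrt(sxx * syy), math.sqrt(sq_err / n))
    return scores


# Synthetic data (an assumption, not real park factors): each park has a
# fixed true factor plus independent yearly noise.
rng = random.Random(0)
parks = {}
for i in range(30):
    true_factor = rng.gauss(1.0, 0.10)
    parks[f"park_{i}"] = [true_factor + rng.gauss(0.0, 0.10) for _ in range(20)]

scores = window_scores(parks)
for k, (r, rmse) in scores.items():
    print(f"{k}-year window: r = {r:.3f}, RMSE = {rmse:.4f}")

best = min(scores, key=lambda k: scores[k][1])
print(f"lowest RMSE: {best}-year window")
```

Because the synthetic parks never change, longer windows would be expected to score better here. With real data, renovations or moved fences could favor shorter windows, and scoring on held-out seasons would be a fairer test than the in-sample numbers above.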

  4. #4
    Squared Box
    Join Date: 04-19-07
    Posts: 91
    Betpoints: 3676

    5 years of data is best.
