To those with MLB models....
gamblingisfun SBR Sharp
- 08-14-10
- 401
#1 Models have different variables in them, and I'm wondering whether the variables that work for one year work well for the next. I personally use different percentage weightings for all my variables. For those who do the same: how do your weightings differ from year to year, if you tweak them each year? For example, say I use variables X, Y, and Z at 33.33% each. If that worked well last season, might a new season need something like 40-20-40% to work best? How dramatic a change should I anticipate year over year?
Waterstpub87 SBR MVP
- 09-09-09
- 4102
#2 If you have an actual, accurate model, it should be the same year to year. Unless the fundamental rules of baseball change, your variables shouldn't change. If you find that you have to shift weightings year to year, I don't think you are accurately representing the underlying win probabilities of teams.
gamblingisfun SBR Sharp
- 08-14-10
- 401
#3 I've only used my model for one year, last season; this is year two. I'm just hoping the weightings don't change.
Inspirited SBR MVP
- 06-26-10
- 1788
#4 How did you come up with your weightings?
mebaran SBR MVP
- 09-16-09
- 1540
#5 Things change every year. Organizations move in their outfield fences, add seats that cut into foul territory, etc. Macro events happen, like crackdowns on steroid use, that COMPLETELY change the game. It's no coincidence that league hitting stats have gone down consistently over the last 4 or 5 years.
Your model has to change from season to season, and mid-season as well.
matthew919 SBR Sharp
- 11-21-12
- 421
#6 I'm going to side with Waters on this one. A game-changing effect on park factor is rare. My feeling is that if you suspect something fishy is going on at a park, where the current year deviates drastically from your 3- or 5-year park factor, you should investigate it manually and update your model accordingly (that is, if you can convince yourself it's real). But I claim these cases will be exceedingly rare.
As far as the steroids argument? That's meaningless. A guy who hits 50 HR a year on steroids should perform just as well as a guy who hits 50 HR a year NOT on steroids. A good model will not need to infer what chemical cocktails might be present in a hitter's bloodstream, any more than it should care what the hitter ate for breakfast. The one caveat is when a guy who was ON roids goes OFF (or vice versa, I suppose), so that his historical stats are inflated (or deflated) relative to the current expectation. Of course, these cases are impossible to identify computationally, so I would just use stats no more than a year old when modeling, which is advisable anyway, based on career trajectory and whatnot.
Bottom line: more data should have the effect of refining your model, not completely changing it.
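For anyone who wants to automate the "something fishy at a park" check, here is a rough Python sketch of comparing the current year against a multi-year baseline. The function name, the 10% threshold, and the sample park factors are all made up for illustration; this is not anyone's actual method from the thread, just one way the flagging step might look before the manual investigation.

```python
from statistics import mean


def park_factor_flag(history, current, threshold=0.10):
    """Compare this season's park factor to a multi-year baseline.

    history:   park factors from prior seasons (1.00 = neutral park),
               e.g. the last 3 or 5 years.
    current:   this season's park factor so far.
    threshold: hypothetical cutoff for "deviates drastically".
    Returns (baseline, relative_deviation, flagged).
    """
    baseline = mean(history)
    deviation = (current - baseline) / baseline
    return baseline, deviation, abs(deviation) > threshold


# Made-up numbers: three prior seasons near neutral, then a big jump.
baseline, deviation, flagged = park_factor_flag([1.02, 0.98, 1.00], 1.15)
print(f"baseline={baseline:.3f} deviation={deviation:+.1%} flagged={flagged}")
```

A flag here only means "go look at it by hand"; whether the change is real (moved fences, new seating) is still a judgment call.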
matthew919 SBR Sharp
- 11-21-12
- 421
#8 I think there's a big distinction between player performance and the dynamics of the game. The first will fluctuate; the second will not. So in that sense, no, I don't believe the game has changed at all in the past 10 years.
But that's just my approach, which is not to say you can't build a successful model that operates under a completely different belief.
EXhoosier10 SBR MVP
- 07-06-09
- 3122
#9 If one were to model games, different run-scoring environments should absolutely change your model. If McGwire, Sosa, and the like can hit a home run off anybody on PEDs, hitters are going to play a much larger role in your model than if they stop and all of a sudden suck against good pitchers but can still go deep on bad ones.
In reply to this:
"As far as the steroids argument? That's meaningless. A guy who hits 50 HR a year on steroids should perform just as well as a guy who hits 50 HR a year NOT on steroids."
50 HR guys on steroids don't really exist as 50 HR guys not on steroids. If you don't buy 50 HR being the cutoff, 60 and 70 HR guys surely would suffice.
Outside of steroids, park factors change and become more reliable every year, so having more data should change the % you weight PF in newer parks. Pitchers being able to dominate hitters more and more (for whatever reason: specialized bullpens, stricter pitch counts, etc.) should at least change your model by small amounts every year.
mycon SBR Rookie
- 04-13-11
- 29
#10 If you have to change your weightings all the time, you are probably backfitting to a degree.
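One way to check for backfitting is to pick weightings on one season and then score those same weightings, untouched, on the next. The Python sketch below does that with average log loss. Everything in it is an assumption for illustration: the three-variable weighted sum, the logistic squashing into a win probability, the candidate weightings, and the toy game data are all made up, not taken from any model in this thread.

```python
import math


def win_prob(weights, features):
    """Weighted sum of a game's variables, squashed to a probability."""
    score = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))


def log_loss(weights, games):
    """Average log loss over (features, won) pairs; lower is better."""
    total = 0.0
    for features, won in games:
        p = win_prob(weights, features)
        total -= math.log(p if won else 1.0 - p)
    return total / len(games)


def backfit_check(candidates, last_season, this_season):
    """Choose the best weights on last season, then score them on this one."""
    best = min(candidates, key=lambda w: log_loss(w, last_season))
    return best, log_loss(best, last_season), log_loss(best, this_season)


# Made-up toy data: (X, Y, Z) values per game and whether the team won.
last_season = [((1.0, 0.2, 0.5), True), ((-0.8, 0.1, -0.4), False),
               ((0.3, -0.6, 0.9), True)]
this_season = [((0.7, -0.3, 0.2), True), ((-0.5, 0.4, -0.9), False),
               ((0.2, 0.8, -0.6), False)]
candidates = [(1 / 3, 1 / 3, 1 / 3), (0.4, 0.2, 0.4), (0.2, 0.6, 0.2)]

best, in_sample, out_of_sample = backfit_check(candidates, last_season, this_season)
print(best, round(in_sample, 3), round(out_of_sample, 3))
```

If the out-of-sample loss is much worse than the in-sample loss, the weightings were probably tuned to last season's noise. A real check would need far more games than this.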
Waterstpub87 SBR MVP
- 09-09-09
- 4102
#11 In reply to this (EXhoosier10, #9):
"If one were to model games, different run-scoring environments should absolutely change your model. If McGwire, Sosa, and the like can hit a home run off anybody on PEDs, hitters are going to play a much larger role in your model than if they stop and all of a sudden suck against good pitchers but can still go deep on bad ones. 50 HR guys on steroids don't really exist as 50 HR guys not on steroids. If you don't buy 50 HR being the cutoff, 60 and 70 HR guys surely would suffice. Outside of steroids, park factors change and become more reliable every year, so having more data should change the % you weight PF in newer parks. Pitchers being able to dominate hitters more and more (for whatever reason: specialized bullpens, stricter pitch counts, etc.) should at least change your model by small amounts every year."
When I think of modeling, I think, "What is the best formula that gives me the closest implied probabilities to actual game results?" One formula that will allow me to forecast prices, which I can then compare to the marketplace as a whole. Certainly hitting is a variable in there, one that I would use a combination of statistics to get to. So if this hitting component increases, it will change my pricing and therefore have an effect on the output.
So it is not so much "Hitting is more important now" as "Hitting has the effect of changing price by x times .30". If the hitting stat increases, it will affect price to a larger degree, but the formula itself doesn't change, if you will.
This is just my approach, but I certainly enjoy the discussion and learning different perspectives. I'm somewhat handicapped by the fact that I never played baseball, and I sometimes don't really understand the flow of the game. I have a tendency to just think of things as strings of numbers.
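To make the "x times .30" point concrete, here is a small Python sketch: the coefficient on hitting stays fixed, and only the input moves. The 0.30 comes from the post above; the function names and sample inputs are hypothetical. The second function is the standard no-vig conversion from a win probability to an American moneyline, included on the assumption that "price" means a moneyline to compare against the market.

```python
HITTING_WEIGHT = 0.30  # fixed coefficient, from the "x times .30" example


def hitting_contribution(hitting_stat):
    """Hitting's effect on the output: the weight never changes, the stat does."""
    return HITTING_WEIGHT * hitting_stat


def prob_to_moneyline(p):
    """Convert a win probability to a fair (no-vig) American moneyline."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    if p >= 0.5:
        return -100.0 * p / (1.0 - p)  # favorite: negative price
    return 100.0 * (1.0 - p) / p       # underdog: positive price


# Hypothetical inputs: a higher hitting stat moves the output,
# but the formula (and its 0.30 weight) is the same in both calls.
print(hitting_contribution(1.0), hitting_contribution(1.2))
print(round(prob_to_moneyline(0.60)), round(prob_to_moneyline(0.40)))
```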
mebaran SBR MVP
- 09-16-09
- 1540
#12 ^There we go. Yeah, run modeling should fundamentally remain the same. Take Pythagorean, for example, and tweak exponents as necessary.
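For reference, the Pythagorean expectation mentioned above estimates winning percentage as RS^k / (RS^k + RA^k), where RS is runs scored, RA is runs allowed, and k is the exponent you tweak. A minimal Python sketch, with a hypothetical function name and made-up run totals; the default exponent of 2 is the classic form, and values around 1.83 are commonly used for MLB:

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    The formula stays the same from year to year; only `exponent`
    is adjusted to fit the run-scoring environment.
    """
    if runs_scored < 0 or runs_allowed < 0 or runs_scored + runs_allowed == 0:
        raise ValueError("need non-negative runs with a positive total")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Made-up season totals: 800 runs scored, 700 allowed.
print(round(pythagorean_win_pct(800, 700), 3))
print(round(pythagorean_win_pct(800, 700, exponent=1.83), 3))
```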
gamblingisfun SBR Sharp
- 08-14-10
- 401
#13 I personally have about 16 variables in my model. I just created it last year, so there was almost daily tweaking in April and the first part of May until I found the variable weightings that made it all work as well as possible. I have everything linked together on my model spreadsheet, so if I change one variable it could change who I bet on and how much I bet; basically, it could change a win to a loss. Since it was all linked together, I just ran it daily, collected my data, and changed my weightings to get the best units won and winning percentage. I stopped messing with my variables in early May because I thought I had come up with the right ones, and it worked the rest of the season. So basically I'm hoping that I don't have to do much tweaking at the beginning of this year. I could theoretically base my model 100% on how a reliever did yesterday, or only on how a hitter did last month lol, and I'd go with it if that's what gave me the best results. But what I came up with utilizes all my variables to some degree, so it makes sense in an actual game context.
Brebos SBR MVP
- 02-24-13
- 1209
#14 In reply to this (matthew919, #6):
"Bottom line- more data should have the effect of refining your model, not completely changing it."
Anyone who says otherwise is living in a fantasy world.