Calculating Home Field/Court/Ice Advantage on a Per Team Basis...
I am tinkering with developing a linear regression model for a competitive league using the first part of 2009-2010 season data. I am not yet at the point of validating the model, just working out the format and the code to manipulate the data (scrapers, macros, parsing, etc.).
I am keeping it somewhat simple with respect to the number of independent variables, just to get the format and code working. Then I plan to go back, add more variables and complexity, and evaluate its predictive value by backtesting on 2004-2008.
The offensive and defensive ratings are calculated first without distinguishing between home and road games, to maximize the number of events in the data set. For simplicity's sake I haven't implemented a diminishing-returns function for margin of victory (MOV) or anything like that. These are just the bare-bones ratings for each team's offense and defense.
To illustrate with an example, I chose the NHL. Here are my offense (Off), defense (Def), and home advantage (HA) ratings for each team:
    Team           Off      Def      HA
-------------------------------------------
 1  Anaheim        2.7763   0.3256   0.9256
 2  Atlanta        2.9639   0.7285  -0.1803
 3  Boston         2.4277  -0.2488  -0.1584
 4  Buffalo        2.7172  -0.3814   0.1494
 5  Calgary        2.7099  -0.2675  -0.935
 6  Carolina       2.4296   0.6584   1.131
 7  Chicago        3.1716  -0.4488   0.1613
 8  Colorado       2.83     0.1111   0.3268
 9  Columbus       2.7106   0.4654   0.8182
10  Dallas         2.6815   0.3616   0.5443
11  Detroit        2.562   -0.2299   0.3591
12  Edmonton       2.8043   0.4656  -0.1465
13  Florida        2.624    0.2413  -0.6731
14  Los_Angeles    2.9986   0.0684   0.4598
15  Minnesota      2.6031   0.1433   0.721
16  Montreal       2.3647  -0.0888   0.1815
17  Nashville      2.8597   0.0121  -0.8245
18  New_Jersey     2.7529  -0.6371  -0.4804
19  NY_Islanders   2.4822   0.3683   0.1614
20  NY_Rangers     2.6199  -0.0481  -0.167
21  Ottawa         2.5862   0.2275   0.9764
22  Philadelphia   2.8919   0.1247   0.3491
23  Phoenix        2.564   -0.2973   0.4109
24  Pittsburgh     2.9931   0.2043   2.0231
25  San_Jose       3.1502  -0.1815   0.4196
26  St_Louis       2.68     0.0124  -1.5679
27  Tampa_Bay      2.4449   0.1829   1.2067
28  Toronto        2.8097   0.6841   0.2454
29  Vancouver      3.1678  -0.3977   0.9423
30  Washington     3.5746   0       -0.0095
So to predict the final score of tonight's game between Vancouver and Minnesota...
Team         Off     Def     HA      Predicted Score
----------------------------------------------------
Vancouver    3.168  -0.398   0.942   2.840
Minnesota    2.603   0.143   0.721   2.566
Vancouver's score (away): Score = VanOff + MinDef - VanHA/2
Minnesota's score (home): Score = MinOff + VanDef + MinHA/2
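As a quick check of the arithmetic, here is a minimal Python sketch of the two formulas above. The function name `predicted_scores` and the dictionary keys are illustrative assumptions, not part of the original spreadsheet or macros; the ratings are the Vancouver and Minnesota rows from the table.

```python
def predicted_scores(away, home):
    """Return (away_score, home_score) from the rating formulas above.

    Each argument is a dict with keys "off", "def" and "ha"
    (hypothetical names chosen for this sketch).
    """
    # Away team: own offense + opponent defense - half of its own HA
    away_score = away["off"] + home["def"] - away["ha"] / 2
    # Home team: own offense + opponent defense + half of its own HA
    home_score = home["off"] + away["def"] + home["ha"] / 2
    return away_score, home_score


# Ratings copied from the table (Vancouver away, Minnesota home)
vancouver = {"off": 3.1678, "def": -0.3977, "ha": 0.9423}
minnesota = {"off": 2.6031, "def": 0.1433, "ha": 0.7210}

van_score, min_score = predicted_scores(vancouver, minnesota)
print(round(van_score, 3), round(min_score, 3))
```

Rounded to three decimals, this should reproduce the 2.840 and 2.566 shown in the prediction table.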
Vancouver's probability of winning the game is then estimated as:
2.840 / (2.840 + 2.566) = 0.525, which corresponds to a fair moneyline of -111 for Vancouver.
That is close to the market price of -115. Most of my other games are pretty close to the line as well, but some are off a bit. That is where the backtesting will eventually come in.
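The conversion from predicted scores to a win probability and a fair moneyline can be sketched like this. It simply mirrors the ratio used above (score share as win probability) and the standard American-odds conversion with no vig; the function names are assumptions made for the example.

```python
def win_probability(score_a, score_b):
    """Win probability for team A as its share of the total predicted goals."""
    return score_a / (score_a + score_b)


def fair_moneyline(p):
    """Convert a win probability (0 < p < 1) to a no-vig American moneyline."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if p >= 0.5:
        # Favorite: negative line, amount risked to win 100
        return -100 * p / (1 - p)
    # Underdog: positive line, amount won on a 100 stake
    return 100 * (1 - p) / p


p_van = win_probability(2.840, 2.566)
line_van = fair_moneyline(p_van)
print(round(p_van, 3), round(line_van))
```

With the predicted scores from the example, this gives roughly 0.525 and -111, matching the numbers quoted above.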
My question is about the method I used to calculate Home Ice Advantage. I calculated separate home and away offensive and defensive ratings for each team, then took the difference between them to represent the home versus away ice advantage.
For example, if a team was rated, say, 0.4 goals better on offense at home and 0.3 goals better on defense, then that team's HA would be 0.7.
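A small sketch of that HA calculation, under one assumption about sign convention: since Def is added to the opponent's score in the prediction formulas, a lower Def is better, so "better on defense at home" is taken to mean away Def minus home Def. The function name and the split ratings below are hypothetical, chosen only to reproduce the 0.4 + 0.3 = 0.7 example.

```python
def home_advantage(off_home, off_away, def_home, def_away):
    """Per-team home advantage from separate home/away ratings.

    Assumes lower Def means better defense (Def is added to the
    opponent's predicted score), so the defensive gain at home is
    def_away - def_home.
    """
    offense_gain = off_home - off_away   # extra goals scored at home
    defense_gain = def_away - def_home   # fewer goals allowed at home
    return offense_gain + defense_gain


# Hypothetical split ratings: 0.4 better on offense, 0.3 better on defense at home
example_ha = home_advantage(off_home=3.0, off_away=2.6, def_home=0.0, def_away=0.3)
print(round(example_ha, 4))  # 0.7
```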
So, since Vancouver will be away and Minnesota will be home, you subtract half of Vancouver's home advantage from its projected score and add half of Minnesota's to its projected score.
Does anyone have a different way to calculate and implement independent Home Advantages or does this seem reasonable?
There really aren't that many NHL models to compare to, although I found one with somewhat similar numbers.
Again, this sport is just an example, since the number of teams is limited and the numbers are not very cumbersome to work with, as I get the automation process going.
I just wanted to run it by some folks to see if they agree/disagree or do it a different way...
Thanks,
Miz