Math Model: how do you compare teams that played opponents of different difficulty?
I always like to include examples of what I'm talking about, and this one is best described with one. I just want the BEST math behind this and the right way to do it. I have tried about 4 different methods and I have my favorites, but I want to know how you guys would do it. This is just something small that I need to zero in on for accuracy.
Answer this question, and explain how you would do it.
If Team A's offense averages 78.2 points against teams that give up on average 71.96667 and Team B's defense gives up 66.1 points against teams that put up on average 75.0875 then what is a good estimation about how many points Team A will put up based on the given information?
Team A 78.2 71.9667
Team B 66.1 75.0875
The problem I'm coming up with is with the denominator. If Team A's difference is +6.233 and Team B's difference is -8.98, then I come up with -2.75 against both teams' opponents, but which denominator (opp) do I add this to? Do I take the average of the denominators and then add it to get 70.78 points for Team A?
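For anyone following along, here is a minimal sketch of the arithmetic in this first method, assuming the averaged-denominator variant described above. The function name `additive_estimate` and its parameter names are made up for illustration.

```python
def additive_estimate(a_off, a_opp_def, b_def, b_opp_off):
    """First method: add both differentials to the average of the two baselines."""
    a_diff = a_off - a_opp_def            # Team A scores this much more than its opponents allow
    b_diff = b_def - b_opp_off            # Team B allows this much less than its opponents score
    baseline = (a_opp_def + b_opp_off) / 2  # the "average of the denominators"
    return baseline + a_diff + b_diff

estimate = additive_estimate(78.2, 71.96667, 66.1, 75.0875)
print(round(estimate, 2))  # about 70.77, matching the 70.78 above up to rounding
```

Note that this only answers "how is the number computed", not "which baseline is correct"; swapping in either single denominator instead of the average shifts the result by about 1.5 points.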
Another method is this: Team A averages 78.2 against teams that give up on average 71.9667, but Team B doesn't give up 71.9667, they give up 66.1, so Team A's offense will have 71.825 (78.2 / 71.9667 * 66.1). On the other side, Team B gives up on average 66.1 against teams that average 75.0875, but Team A doesn't put up 75.0875, they put up 78.2, so Team B's defense will give up 68.84 (66.1 / 75.0875 * 78.2). Then take the average of 68.84 and 71.825.
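The second method is a ratio adjustment, and a short sketch makes the two views explicit. The function name `ratio_estimate` is hypothetical; the numbers are the ones given in the post.

```python
def ratio_estimate(a_off, a_opp_def, b_def, b_opp_off):
    """Second method: scale each side by its ratio to its opponents' average."""
    # Team A's offense relative to what its opponents allow, applied to Team B's defense
    offense_view = a_off / a_opp_def * b_def
    # Team B's defense relative to what its opponents score, applied to Team A's offense
    defense_view = b_def / b_opp_off * a_off
    return offense_view, defense_view, (offense_view + defense_view) / 2

offense_view, defense_view, blended = ratio_estimate(78.2, 71.96667, 66.1, 75.0875)
print(round(offense_view, 3), round(defense_view, 2), round(blended, 2))
```

The gap between the two views (roughly 3 points here) is a direct measure of how different the two teams' schedules were.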
Obviously there IS a distinct difference in the difficulty of the opponents, and there is a difference between 68.84 points and 71.825 points in the 2nd example. Should I just stick to games in which the opponents' difficulty is fairly similar? Divisional games in NCAAB, and NBA mid-season? NBA isn't that much of a problem because the teams all play each other. The denominators are fairly close, which leads to less error, but IMO NBA lines are a lot sharper than college lines. College is where the problem lies. Take a team like Davidson playing Kansas. Slight variations in opponents' difficulty.
Another problem between these 2 ways that I found is when two teams both like to run. Say you have Phoenix playing Denver. If you use the 2nd example method you are taking the averages, which leads to a false total. Say Team A is +6 and Team B's defense is +4; then you would have +10 relative to the teams they've played against. In example 2 the average would be taken between the two numbers and a false total would come out. So instead of having a 117-112 score (method 2) you should really have a 124-119 (method 1) type score. But like I asked before, which denominator do I add the numbers to? So I more or less just need help with the first example because the 2nd gives false totals. Sorry for all the info, just thinking on paper.
Just throwing out some ideas; I have various others that tweak this. What would you guys do?
I didn't read the entire message b/c it's too long, but based on the first paragraph and the title of this thread you are interested in controlling for the quality of competition.
I do not think the way you are considering this issue by looking at point differential is very valid but some might argue that point.
Well, basically, you would need to develop a variable that measures for opponent's strength of schedule and treat that as a stand alone primary variable in your model or you could simply treat it as a control variable...it would be up to you how you wanted to use it but I certainly agree that it is important.
IMO, this is an extremely difficult variable to construct because there are several validity issues that need to be considered before doing this. The biggest obstacle with this variable is that it is ALWAYS varying, and the earlier in a season, the more volatile it is going to be. Really, it won't stabilize until you get deep into the season and everyone has played everyone.
There are a number of reputable websites that offer strength of schedule measures that you can consider. It will save you a ton of time.
He's particularly correct that without prior knowledge of the sport in question (in other words, additional data and/or theory), point estimates of scoring expectations are by themselves generally insufficient to provide the answer for which you're looking.
Off-the-cuff I'd say that to really go about this theoretically correctly ("BEST math", as you say) you'd first want to model the pdf of each team's scoring (for or against as applicable) conditioned on some measure of its opponent's performance. You'd further want to model a marginal (i.e., a "prior") probability distribution for each team's (offensive or defensive) performance (assuming a one-to-one relationship existed between a team's marginal performance and its scoring). Finally you'd utilize Bayesian inference to combine all your estimates in order to determine a posterior pdf for scoring by Team A conditioned on the joint realization of your predetermined performance measures. By integrating across all possible performance measure realizations you'd then be able to calculate a final expectation. This would be what I'd consider a serious project, with likely considerably overreaching complexity.
If, however, you'd settle for roughly estimating Team A's win probability vs. Team B, then there is a straightforward (if slightly involved) methodology that may be employed to this end. Still necessary, however, is the assumption that the opponents of Team A and B have themselves played against directly relatable opposition (consider that there'd be no way one would be able to draw meaningful conclusions about the Knicks' likely performance against a Division III team based solely on the two teams' results within their respective leagues).
I'll warn you, however, that the results at best represent a first-order approximation far better suited to Rotisserie League play than to professional advantage betting. Nevertheless, the general techniques involved may be instructive and could potentially prove useful within alternate contexts.
Given by OP:
Team A's points for = 78.2
Team A's opps' points against = 71.96667
Team B's points against = 66.1
Team B's opps' points for = 75.0875
Let's add a few more variables to the problem specification, namely:
Team A's points against = w
Team A's opps' points for = x
Team B's points for = y
Team B's opps' points against = z
Baseball's Bill James posited that the number of wins expected by a baseball team over some stretch of games could be reasonably estimated as a function solely of total runs scored and total runs allowed. This is known as the "Pythagorean expectation" and is given by the following formula:
Expected MLB Wins = Games Played * RunsFor^n/(RunsFor^n + RunsAgainst^n)
where n is a data-determined exponent equal to roughly 2 for Major League Baseball.
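As a quick illustration of the formula, here is a minimal sketch. The function name and the 800/700-run team are hypothetical, not taken from any real season.

```python
def pythagorean_wins(games, scored, allowed, n=2.0):
    """Expected wins = games * scored^n / (scored^n + allowed^n)."""
    return games * scored**n / (scored**n + allowed**n)

# Hypothetical MLB team: 800 runs scored, 700 allowed over 162 games
expected = pythagorean_wins(162, 800, 700)
print(round(expected, 2))  # about 91.75 expected wins
```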
It's certainly possible (although not necessarily advisable) to apply this same methodology to other sports. I'm going to demonstrate an extremely simple exponent estimation technique for the NBA. Disinterested readers won't lose much by skipping ahead to the conclusion.
Using covers.com data one can derive the following data set through the 2006-07 season (also see Sheet1 of the attached spreadsheet):
Team | Points For | Points Against | Wins | Games
ATLANTA1991 | 9,489 | 9,466 | 45 | 87
ATLANTA1992 | 8,711 | 8,834 | 38 | 82
ATLANTA1993 | 9,101 | 9,214 | 43 | 85
ATLANTA1994 | 9,321 | 8,855 | 63 | 93
ATLANTA1995 | 8,189 | 8,116 | 42 | 85
ATLANTA1996 | 8,315 | 8,262 | 46 | 85
ATLANTA1997 | 8,204 | 7,813 | 56 | 87
ATLANTA1998 | 8,210 | 7,916 | 51 | 86
ATLANTA1999 | 5,032 | 4,912 | 34 | 59
ATLANTA2000 | 7,735 | 8,176 | 28 | 82
ATLANTA2001 | 7,459 | 7,886 | 25 | 82
ATLANTA2002 | 7,711 | 8,058 | 33 | 82
ATLANTA2003 | 7,714 | 8,006 | 35 | 82
ATLANTA2004 | 7,611 | 7,992 | 28 | 82
ATLANTA2005 | 7,605 | 8,401 | 13 | 82
ATLANTA2006 | 7,972 | 8,362 | 26 | 82
ATLANTA2007 | 7,680 | 8,070 | 30 | 82
BOSTON1991 | 10,359 | 9,869 | 61 | 93
BOSTON1992 | 9,816 | 9,518 | 57 | 92
BOSTON1993 | 8,904 | 8,852 | 49 | 86
BOSTON1994 | 8,267 | 8,618 | 32 | 82
BOSTON1995 | 8,773 | 8,975 | 36 | 86
BOSTON1996 | 8,002 | 8,258 | 30 | 77
BOSTON1997 | 7,941 | 8,567 | 13 | 79
BOSTON1998 | 7,864 | 8,079 | 36 | 82
BOSTON1999 | 4,650 | 4,743 | 19 | 50
BOSTON2000 | 8,146 | 8,208 | 35 | 82
BOSTON2001 | 7,759 | 7,934 | 36 | 82
BOSTON2002 | 9,361 | 9,135 | 58 | 98
BOSTON2003 | 8,545 | 8,583 | 48 | 92
BOSTON2004 | 8,149 | 8,335 | 36 | 86
BOSTON2005 | 8,918 | 8,851 | 48 | 89
BOSTON2006 | 8,033 | 8,159 | 33 | 82
BOSTON2007 | 7,857 | 8,137 | 24 | 82
CHARLOTTE2005 | 7,729 | 8,220 | 18 | 82
CHARLOTTE2006 | 7,943 | 8,270 | 26 | 82
CHARLOTTE2007 | 7,945 | 8,252 | 33 | 82
CHICAGO1991 | 10,791 | 9,847 | 76 | 99
CHICAGO1992 | 11,219 | 10,227 | 82 | 104
CHICAGO1993 | 10,570 | 9,943 | 72 | 101
CHICAGO1994 | 8,962 | 8,696 | 60 | 92
CHICAGO1995 | 9,306 | 8,916 | 52 | 92
CHICAGO1996 | 9,605 | 8,457 | 82 | 92
CHICAGO1997 | 9,689 | 8,702 | 80 | 95
CHICAGO1998 | 9,887 | 9,157 | 77 | 103
CHICAGO1999 | 4,095 | 4,568 | 13 | 50
CHICAGO2000 | 6,952 | 7,723 | 17 | 82
CHICAGO2001 | 7,181 | 7,927 | 15 | 82
CHICAGO2002 | 7,335 | 8,035 | 21 | 82
CHICAGO2003 | 7,786 | 8,207 | 30 | 82
CHICAGO2004 | 7,355 | 7,876 | 23 | 82
CHICAGO2005 | 8,360 | 8,284 | 49 | 88
CHICAGO2006 | 8,610 | 8,576 | 43 | 88
CHICAGO2007 | 9,024 | 8,602 | 55 | 92
CLEVELAND1991 | 8,343 | 8,545 | 33 | 82
CLEVELAND1992 | 10,687 | 10,197 | 66 | 99
CLEVELAND1993 | 9,675 | 9,165 | 57 | 91
CLEVELAND1994 | 8,580 | 8,270 | 47 | 85
CLEVELAND1995 | 7,747 | 7,729 | 44 | 86
CLEVELAND1996 | 6,718 | 6,598 | 40 | 74
CLEVELAND1997 | 6,867 | 6,713 | 40 | 78
CLEVELAND1998 | 7,908 | 7,716 | 48 | 86
CLEVELAND1999 | 4,322 | 4,408 | 22 | 50
CLEVELAND2000 | 7,950 | 8,237 | 32 | 82
CLEVELAND2001 | 7,561 | 7,909 | 30 | 82
CLEVELAND2002 | 7,812 | 8,085 | 29 | 82
CLEVELAND2003 | 7,495 | 8,284 | 17 | 82
CLEVELAND2004 | 7,619 | 7,834 | 35 | 82
CLEVELAND2005 | 7,914 | 7,849 | 42 | 82
CLEVELAND2006 | 9,177 | 9,035 | 57 | 95
CLEVELAND2007 | 9,710 | 9,353 | 62 | 102
DALLAS1991 | 8,195 | 8,570 | 28 | 82
DALLAS1992 | 8,007 | 8,634 | 22 | 82
DALLAS1993 | 8,141 | 9,387 | 11 | 82
DALLAS1994 | 7,811 | 8,514 | 13 | 82
DALLAS1995 | 8,463 | 8,700 | 36 | 82
DALLAS1996 | 5,632 | 5,846 | 19 | 55
DALLAS1997 | 6,921 | 7,374 | 23 | 76
DALLAS1998 | 7,494 | 7,995 | 20 | 82
DALLAS1999 | 4,581 | 4,701 | 19 | 50
DALLAS2000 | 8,316 | 8,363 | 40 | 82
DALLAS2001 | 9,161 | 8,847 | 57 | 92
DALLAS2002 | 9,496 | 9,142 | 61 | 90
DALLAS2003 | 10,526 | 9,898 | 70 | 102
DALLAS2004 | 9,124 | 8,753 | 53 | 87
DALLAS2005 | 9,772 | 9,322 | 64 | 95
DALLAS2006 | 10,424 | 9,832 | 74 | 105
DALLAS2007 | 8,792 | 8,240 | 69 | 88
DENVER1991 | 9,828 | 10,723 | 20 | 82
DENVER1992 | 8,176 | 8,821 | 24 | 82
DENVER1993 | 8,626 | 8,769 | 36 | 82
DENVER1994 | 9,369 | 9,263 | 48 | 94
DENVER1995 | 8,588 | 8,565 | 41 | 85
DENVER1996 | 6,639 | 6,841 | 29 | 68
DENVER1997 | 7,439 | 7,906 | 20 | 76
DENVER1998 | 7,300 | 8,266 | 11 | 82
DENVER1999 | 4,674 | 5,004 | 14 | 50
DENVER2000 | 8,115 | 8,289 | 35 | 82
DENVER2001 | 7,918 | 8,120 | 40 | 82
DENVER2002 | 7,559 | 8,036 | 27 | 82
DENVER2003 | 6,900 | 7,580 | 17 | 82
DENVER2004 | 8,425 | 8,356 | 44 | 87
DENVER2005 | 8,612 | 8,497 | 50 | 87
DENVER2006 | 8,664 | 8,683 | 45 | 87
DENVER2007 | 9,080 | 8,977 | 46 | 87
DETROIT1991 | 9,721 | 9,470 | 57 | 97
DETROIT1992 | 8,439 | 8,408 | 49 | 87
DETROIT1993 | 8,252 | 8,366 | 40 | 82
DETROIT1994 | 7,949 | 8,587 | 20 | 82
DETROIT1995 | 8,053 | 8,651 | 28 | 82
DETROIT1996 | 7,671 | 7,503 | 44 | 80
DETROIT1997 | 7,719 | 7,324 | 52 | 82
DETROIT1998 | 7,721 | 7,592 | 37 | 82
DETROIT1999 | 4,914 | 4,758 | 31 | 55
DETROIT2000 | 8,722 | 8,635 | 42 | 85
DETROIT2001 | 7,837 | 7,976 | 32 | 82
DETROIT2002 | 8,563 | 8,390 | 54 | 92
DETROIT2003 | 9,021 | 8,702 | 58 | 99
DETROIT2004 | 9,391 | 8,765 | 70 | 105
DETROIT2005 | 9,911 | 9,476 | 69 | 107
DETROIT2006 | 9,591 | 8,987 | 74 | 100
DETROIT2007 | 9,334 | 8,946 | 63 | 98
GOLDEN STATE1991 | 10,594 | 10,473 | 48 | 91
GOLDEN STATE1992 | 10,200 | 9,878 | 56 | 86
GOLDEN STATE1993 | 9,007 | 9,071 | 34 | 82
GOLDEN STATE1994 | 9,192 | 9,069 | 50 | 85
GOLDEN STATE1995 | 8,661 | 9,122 | 26 | 82
GOLDEN STATE1996 | 7,653 | 7,759 | 34 | 75
GOLDEN STATE1997 | 7,794 | 8,140 | 29 | 78
GOLDEN STATE1998 | 7,237 | 7,985 | 19 | 82
GOLDEN STATE1999 | 4,416 | 4,541 | 21 | 50
GOLDEN STATE2000 | 7,834 | 8,512 | 19 | 82
GOLDEN STATE2001 | 7,584 | 8,326 | 17 | 82
GOLDEN STATE2002 | 8,009 | 8,452 | 21 | 82
GOLDEN STATE2003 | 8,400 | 8,493 | 38 | 82
GOLDEN STATE2004 | 7,649 | 7,709 | 37 | 82
GOLDEN STATE2005 | 8,094 | 8,271 | 34 | 82
GOLDEN STATE2006 | 8,076 | 8,187 | 34 | 82
GOLDEN STATE2007 | 9,910 | 9,919 | 47 | 93
HOUSTON1991 | 9,033 | 8,763 | 52 | 85
HOUSTON1992 | 8,366 | 8,507 | 42 | 82
HOUSTON1993 | 9,704 | 9,339 | 61 | 94
HOUSTON1994 | 9,943 | 9,485 | 70 | 98
HOUSTON1995 | 10,389 | 10,182 | 58 | 100
HOUSTON1996 | 7,305 | 7,145 | 44 | 72
HOUSTON1997 | 9,068 | 8,711 | 62 | 90
HOUSTON1998 | 8,522 | 8,618 | 43 | 87
HOUSTON1999 | 5,099 | 4,992 | 32 | 54
HOUSTON2000 | 8,156 | 8,227 | 34 | 82
HOUSTON2001 | 7,972 | 7,784 | 45 | 82
HOUSTON2002 | 7,572 | 7,973 | 28 | 82
HOUSTON2003 | 7,689 | 7,567 | 43 | 82
HOUSTON2004 | 7,785 | 7,670 | 46 | 87
HOUSTON2005 | 8,479 | 8,167 | 54 | 89
HOUSTON2006 | 7,387 | 7,517 | 34 | 82
HOUSTON2007 | 8,564 | 8,188 | 55 | 89
INDIANA1991 | 9,751 | 9,785 | 43 | 87
INDIANA1992 | 9,520 | 9,387 | 40 | 85
INDIANA1993 | 9,247 | 9,107 | 42 | 86
INDIANA1994 | 9,701 | 9,404 | 56 | 98
INDIANA1995 | 9,836 | 9,467 | 63 | 99
INDIANA1996 | 7,685 | 7,425 | 49 | 77
INDIANA1997 | 7,487 | 7,467 | 36 | 79
INDIANA1998 | 9,342 | 8,807 | 68 | 98
INDIANA1999 | 5,950 | 5,716 | 42 | 63
INDIANA2000 | 10,562 | 10,124 | 69 | 105
INDIANA2001 | 7,940 | 7,981 | 42 | 86
INDIANA2002 | 8,396 | 8,373 | 44 | 87
INDIANA2003 | 8,487 | 8,235 | 50 | 88
INDIANA2004 | 8,861 | 8,318 | 71 | 98
INDIANA2005 | 8,718 | 8,694 | 50 | 95
INDIANA2006 | 8,235 | 8,102 | 43 | 88
INDIANA2007 | 7,840 | 8,040 | 35 | 82
L.A. CLIPPERS1991 | 8,491 | 8,774 | 31 | 82
L.A. CLIPPERS1992 | 8,931 | 8,863 | 47 | 87
L.A. CLIPPERS1993 | 9,220 | 9,232 | 43 | 87
L.A. CLIPPERS1994 | 8,447 | 8,916 | 27 | 82
L.A. CLIPPERS1995 | 7,937 | 8,678 | 17 | 82
L.A. CLIPPERS1996 | 7,463 | 7,754 | 26 | 75
L.A. CLIPPERS1997 | 7,866 | 8,150 | 32 | 81
L.A. CLIPPERS1998 | 7,865 | 8,469 | 17 | 82
L.A. CLIPPERS1999 | 4,519 | 4,960 | 9 | 50
L.A. CLIPPERS2000 | 7,546 | 8,491 | 15 | 82
L.A. CLIPPERS2001 | 7,581 | 7,818 | 31 | 82
L.A. CLIPPERS2002 | 7,844 | 7,884 | 39 | 82
L.A. CLIPPERS2003 | 7,693 | 8,032 | 27 | 82
L.A. CLIPPERS2004 | 7,770 | 8,147 | 28 | 82
L.A. CLIPPERS2005 | 7,849 | 7,912 | 37 | 82
L.A. CLIPPERS2006 | 9,238 | 9,064 | 54 | 94
L.A. CLIPPERS2007 | 7,843 | 7,881 | 40 | 82
L.A. LAKERS1991 | 10,691 | 10,117 | 70 | 101
L.A. LAKERS1992 | 8,607 | 8,756 | 44 | 86
L.A. LAKERS1993 | 9,041 | 9,154 | 41 | 87
L.A. LAKERS1994 | 8,233 | 8,585 | 33 | 82
L.A. LAKERS1995 | 9,521 | 9,591 | 53 | 92
L.A. LAKERS1996 | 7,796 | 7,526 | 46 | 76
L.A. LAKERS1997 | 8,024 | 7,760 | 52 | 81
L.A. LAKERS1998 | 9,955 | 9,304 | 68 | 95
L.A. LAKERS1999 | 5,702 | 5,574 | 34 | 58
L.A. LAKERS2000 | 10,562 | 9,807 | 82 | 105
L.A. LAKERS2001 | 9,905 | 9,424 | 71 | 98
L.A. LAKERS2002 | 10,161 | 9,507 | 73 | 101
L.A. LAKERS2003 | 9,433 | 9,239 | 56 | 94
L.A. LAKERS2004 | 9,991 | 9,651 | 69 | 104
L.A. LAKERS2005 | 8,095 | 8,338 | 34 | 82
L.A. LAKERS2006 | 8,858 | 8,700 | 48 | 89
L.A. LAKERS2007 | 8,964 | 9,022 | 43 | 87
MEMPHIS1996 | 6,515 | 7,207 | 13 | 72
MEMPHIS1997 | 6,667 | 7,474 | 12 | 75
MEMPHIS1998 | 7,923 | 8,522 | 19 | 82
MEMPHIS1999 | 4,443 | 4,876 | 8 | 50
MEMPHIS2000 | 7,702 | 8,163 | 22 | 82
MEMPHIS2001 | 7,522 | 7,992 | 23 | 82
MEMPHIS2002 | 7,366 | 7,978 | 23 | 82
MEMPHIS2003 | 7,995 | 8,260 | 28 | 82
MEMPHIS2004 | 8,264 | 8,120 | 50 | 86
MEMPHIS2005 | 8,072 | 7,928 | 45 | 86
MEMPHIS2006 | 7,895 | 7,648 | 49 | 86
MEMPHIS2007 | 8,331 | 8,753 | 22 | 82
MIAMI1991 | 8,349 | 8,840 | 24 | 82
MIAMI1992 | 8,906 | 9,305 | 38 | 85
MIAMI1993 | 8,495 | 8,599 | 36 | 82
MIAMI1994 | 8,924 | 8,739 | 44 | 87
MIAMI1995 | 8,293 | 8,428 | 32 | 82
MIAMI1996 | 7,051 | 6,954 | 36 | 73
MIAMI1997 | 8,773 | 8,314 | 66 | 94
MIAMI1998 | 8,224 | 7,831 | 57 | 87
MIAMI1999 | 4,844 | 4,616 | 35 | 55
MIAMI2000 | 8,571 | 8,291 | 58 | 92
MIAMI2001 | 7,524 | 7,403 | 50 | 85
MIAMI2002 | 7,150 | 7,274 | 36 | 82
MIAMI2003 | 7,016 | 7,430 | 25 | 82
MIAMI2004 | 8,495 | 8,450 | 48 | 95
MIAMI2005 | 9,797 | 9,198 | 70 | 97
MIAMI2006 | 10,405 | 10,001 | 68 | 105
MIAMI2007 | 8,114 | 8,233 | 44 | 86
MILWAUKEE1991 | 9,029 | 8,860 | 48 | 85
MILWAUKEE1992 | 8,608 | 8,749 | 31 | 82
MILWAUKEE1993 | 8,391 | 8,699 | 28 | 82
MILWAUKEE1994 | 7,949 | 8,480 | 20 | 82
MILWAUKEE1995 | 8,160 | 8,497 | 34 | 82
MILWAUKEE1996 | 7,451 | 7,878 | 24 | 78
MILWAUKEE1997 | 7,142 | 7,291 | 29 | 75
MILWAUKEE1998 | 7,748 | 7,905 | 36 | 82
MILWAUKEE1999 | 4,870 | 4,818 | 28 | 53
MILWAUKEE2000 | 8,780 | 8,753 | 44 | 87
MILWAUKEE2001 | 9,982 | 9,639 | 62 | 100
MILWAUKEE2002 | 7,996 | 8,014 | 41 | 82
MILWAUKEE2003 | 8,744 | 8,752 | 44 | 88
MILWAUKEE2004 | 8,467 | 8,442 | 42 | 87
MILWAUKEE2005 | 7,973 | 8,218 | 30 | 82
MILWAUKEE2006 | 8,508 | 8,641 | 41 | 87
MILWAUKEE2007 | 8,172 | 8,531 | 28 | 82
MINNESOTA1991 | 8,169 | 8,491 | 29 | 82
MINNESOTA1992 | 8,237 | 8,815 | 15 | 82
MINNESOTA1993 | 8,042 | 8,684 | 19 | 82
MINNESOTA1994 | 7,930 | 8,498 | 20 | 82
MINNESOTA1995 | 7,716 | 8,464 | 21 | 82
MINNESOTA1996 | 6,886 | 7,332 | 20 | 71
MINNESOTA1997 | 7,703 | 7,855 | 38 | 80
MINNESOTA1998 | 8,741 | 8,712 | 47 | 87
MINNESOTA1999 | 4,969 | 4,975 | 26 | 54
MINNESOTA2000 | 8,420 | 8,221 | 51 | 86
MINNESOTA2001 | 8,310 | 8,225 | 48 | 86
MINNESOTA2002 | 8,451 | 8,206 | 50 | 85
MINNESOTA2003 | 8,648 | 8,516 | 53 | 88
MINNESOTA2004 | 9,408 | 8,959 | 68 | 100
MINNESOTA2005 | 7,934 | 7,815 | 44 | 82
MINNESOTA2006 | 7,522 | 7,676 | 33 | 82
MINNESOTA2007 | 7,877 | 8,178 | 32 | 82
NEW JERSEY1991 | 8,441 | 8,811 | 26 | 82
NEW JERSEY1992 | 9,048 | 9,220 | 41 | 86
NEW JERSEY1993 | 8,899 | 8,819 | 45 | 87
NEW JERSEY1994 | 8,807 | 8,656 | 46 | 86
NEW JERSEY1995 | 8,042 | 8,299 | 30 | 82
NEW JERSEY1996 | 6,943 | 7,249 | 28 | 74
NEW JERSEY1997 | 7,432 | 7,791 | 24 | 76
NEW JERSEY1998 | 8,461 | 8,353 | 43 | 85
NEW JERSEY1999 | 4,569 | 4,758 | 16 | 50
NEW JERSEY2000 | 8,036 | 8,120 | 31 | 82
NEW JERSEY2001 | 7,552 | 7,966 | 26 | 82
NEW JERSEY2002 | 9,796 | 9,456 | 63 | 102
NEW JERSEY2003 | 9,693 | 9,198 | 63 | 102
NEW JERSEY2004 | 8,371 | 8,139 | 54 | 93
NEW JERSEY2005 | 7,883 | 8,057 | 42 | 86
NEW JERSEY2006 | 8,727 | 8,625 | 54 | 93
NEW JERSEY2007 | 9,083 | 9,124 | 47 | 94
NEW ORLEANS1991 | 8,428 | 8,858 | 26 | 82
NEW ORLEANS1992 | 8,980 | 9,300 | 31 | 82
NEW ORLEANS1993 | 9,930 | 9,955 | 48 | 91
NEW ORLEANS1994 | 8,732 | 8,750 | 41 | 82
NEW ORLEANS1995 | 8,619 | 8,366 | 51 | 86
NEW ORLEANS1996 | 7,517 | 7,595 | 36 | 73
NEW ORLEANS1997 | 7,634 | 7,496 | 50 | 77
NEW ORLEANS1998 | 8,668 | 8,558 | 55 | 91
NEW ORLEANS1999 | 4,644 | 4,649 | 26 | 50
NEW ORLEANS2000 | 8,437 | 8,229 | 50 | 86
NEW ORLEANS2001 | 8,496 | 8,261 | 52 | 92
NEW ORLEANS2002 | 8,565 | 8,486 | 48 | 91
NEW ORLEANS2003 | 8,256 | 8,093 | 49 | 88
NEW ORLEANS2004 | 8,092 | 8,122 | 44 | 89
NEW ORLEANS2005 | 7,252 | 7,832 | 18 | 82
NEW ORLEANS2006 | 7,611 | 7,842 | 38 | 82
NEW ORLEANS2007 | 7,833 | 7,962 | 39 | 82
NEW YORK1991 | 8,713 | 8,792 | 39 | 85
NEW YORK1992 | 9,411 | 9,080 | 57 | 94
NEW YORK1993 | 9,795 | 9,293 | 69 | 97
NEW YORK1994 | 9,695 | 9,084 | 69 | 100
NEW YORK1995 | 9,081 | 8,782 | 61 | 93
NEW YORK1996 | 7,589 | 7,410 | 46 | 79
NEW YORK1997 | 8,265 | 7,999 | 60 | 87
NEW YORK1998 | 8,395 | 8,215 | 47 | 92
NEW YORK1999 | 6,020 | 5,929 | 39 | 70
NEW YORK2000 | 8,906 | 8,803 | 59 | 98
NEW YORK2001 | 7,720 | 7,520 | 50 | 87
NEW YORK2002 | 7,514 | 7,839 | 30 | 82
NEW YORK2003 | 7,860 | 7,971 | 37 | 82
NEW YORK2004 | 7,878 | 8,050 | 39 | 86
NEW YORK2005 | 7,977 | 8,177 | 33 | 82
NEW YORK2006 | 7,842 | 8,367 | 23 | 82
NEW YORK2007 | 7,994 | 8,228 | 33 | 82
ORLANDO1991 | 8,684 | 9,010 | 31 | 82
ORLANDO1992 | 8,330 | 8,897 | 21 | 82
ORLANDO1993 | 8,653 | 8,543 | 41 | 82
ORLANDO1994 | 8,941 | 8,638 | 50 | 85
ORLANDO1995 | 10,790 | 10,200 | 67 | 99
ORLANDO1996 | 8,538 | 8,113 | 61 | 82
ORLANDO1997 | 7,577 | 7,658 | 42 | 81
ORLANDO1998 | 7,387 | 7,475 | 41 | 82
ORLANDO1999 | 4,818 | 4,713 | 34 | 54
ORLANDO2000 | 8,206 | 8,150 | 41 | 82
ORLANDO2001 | 8,403 | 8,345 | 44 | 86
ORLANDO2002 | 8,615 | 8,506 | 45 | 86
ORLANDO2003 | 8,690 | 8,731 | 45 | 89
ORLANDO2004 | 7,711 | 8,287 | 21 | 82
ORLANDO2005 | 8,160 | 8,344 | 36 | 82
ORLANDO2006 | 7,784 | 7,872 | 36 | 82
ORLANDO2007 | 8,123 | 8,095 | 40 | 86
PHILADELPHIA1991 | 9,448 | 9,473 | 48 | 90
PHILADELPHIA1992 | 8,358 | 8,462 | 35 | 82
PHILADELPHIA1993 | 8,556 | 9,029 | 26 | 82
PHILADELPHIA1994 | 8,033 | 8,658 | 25 | 82
PHILADELPHIA1995 | 7,820 | 8,236 | 24 | 82
PHILADELPHIA1996 | 7,450 | 8,275 | 16 | 79
PHILADELPHIA1997 | 8,131 | 8,658 | 22 | 81
PHILADELPHIA1998 | 7,651 | 7,847 | 31 | 82
PHILADELPHIA1999 | 5,197 | 5,090 | 31 | 58
PHILADELPHIA2000 | 8,713 | 8,616 | 54 | 92
PHILADELPHIA2001 | 9,887 | 9,538 | 68 | 105
PHILADELPHIA2002 | 7,906 | 7,819 | 45 | 87
PHILADELPHIA2003 | 9,046 | 8,847 | 54 | 94
PHILADELPHIA2004 | 7,215 | 7,419 | 33 | 82
PHILADELPHIA2005 | 8,582 | 8,683 | 44 | 87
PHILADELPHIA2006 | 8,147 | 8,307 | 38 | 82
PHILADELPHIA2007 | 7,785 | 8,033 | 35 | 82
PHOENIX1991 | 9,731 | 9,240 | 56 | 86
PHOENIX1992 | 10,142 | 9,644 | 57 | 90
PHOENIX1993 | 11,813 | 11,257 | 75 | 106
PHOENIX1994 | 9,923 | 9,578 | 61 | 92
PHOENIX1995 | 10,183 | 9,824 | 65 | 92
PHOENIX1996 | 7,681 | 7,679 | 38 | 74
PHOENIX1997 | 8,623 | 8,660 | 39 | 84
PHOENIX1998 | 8,538 | 8,143 | 57 | 86
PHOENIX1999 | 5,056 | 4,974 | 27 | 53
PHOENIX2000 | 8,897 | 8,502 | 57 | 91
PHOENIX2001 | 8,064 | 7,920 | 52 | 86
PHOENIX2002 | 7,802 | 7,856 | 36 | 82
PHOENIX2003 | 8,345 | 8,283 | 46 | 88
PHOENIX2004 | 7,723 | 8,029 | 29 | 82
PHOENIX2005 | 10,734 | 10,087 | 71 | 97
PHOENIX2006 | 11,030 | 10,551 | 64 | 102
PHOENIX2007 | 10,182 | 9,528 | 67 | 93
PORTLAND1991 | 11,069 | 10,336 | 72 | 98
PORTLAND1992 | 11,444 | 10,778 | 70 | 103
PORTLAND1993 | 9,297 | 9,033 | 52 | 86
PORTLAND1994 | 9,210 | 9,015 | 48 | 86
PORTLAND1995 | 8,752 | 8,491 | 44 | 85
PORTLAND1996 | 7,102 | 7,051 | 36 | 72
PORTLAND1997 | 7,909 | 7,573 | 49 | 80
PORTLAND1998 | 8,133 | 8,035 | 47 | 86
PORTLAND1999 | 5,862 | 5,550 | 42 | 63
PORTLAND2000 | 9,470 | 8,870 | 69 | 98
PORTLAND2001 | 8,091 | 7,791 | 50 | 85
PORTLAND2002 | 8,191 | 7,965 | 49 | 85
PORTLAND2003 | 8,511 | 8,290 | 53 | 89
PORTLAND2004 | 7,439 | 7,544 | 41 | 82
PORTLAND2005 | 7,621 | 7,949 | 27 | 82
PORTLAND2006 | 7,285 | 8,060 | 21 | 82
PORTLAND2007 | 7,717 | 8,069 | 32 | 82
SACRAMENTO1991 | 7,928 | 8,484 | 25 | 82
SACRAMENTO1992 | 8,549 | 9,046 | 29 | 82
SACRAMENTO1993 | 8,847 | 9,103 | 25 | 82
SACRAMENTO1994 | 8,291 | 8,764 | 28 | 82
SACRAMENTO1995 | 8,060 | 8,134 | 39 | 82
SACRAMENTO1996 | 7,031 | 7,264 | 31 | 71
SACRAMENTO1997 | 7,238 | 7,508 | 30 | 75
SACRAMENTO1998 | 7,641 | 8,101 | 27 | 82
SACRAMENTO1999 | 5,462 | 5,507 | 29 | 55
SACRAMENTO2000 | 9,089 | 8,890 | 46 | 87
SACRAMENTO2001 | 9,123 | 8,646 | 58 | 90
SACRAMENTO2002 | 10,193 | 9,535 | 71 | 98
SACRAMENTO2003 | 9,635 | 9,076 | 66 | 94
SACRAMENTO2004 | 9,575 | 9,163 | 62 | 94
SACRAMENTO2005 | 9,016 | 8,861 | 51 | 87
SACRAMENTO2006 | 8,690 | 8,621 | 46 | 88
SACRAMENTO2007 | 8,303 | 8,451 | 33 | 82
SAN ANTONIO1991 | 9,213 | 8,863 | 56 | 86
SAN ANTONIO1992 | 8,834 | 8,589 | 47 | 85
SAN ANTONIO1993 | 9,659 | 9,439 | 54 | 92
SAN ANTONIO1994 | 8,554 | 8,156 | 56 | 86
SAN ANTONIO1995 | 10,219 | 9,659 | 71 | 97
SAN ANTONIO1996 | 8,473 | 8,037 | 56 | 83
SAN ANTONIO1997 | 6,182 | 6,691 | 17 | 67
SAN ANTONIO1998 | 8,413 | 8,057 | 60 | 91
SAN ANTONIO1999 | 6,143 | 5,617 | 52 | 67
SAN ANTONIO2000 | 8,213 | 7,731 | 54 | 86
SAN ANTONIO2001 | 9,076 | 8,445 | 65 | 95
SAN ANTONIO2002 | 8,843 | 8,304 | 62 | 92
SAN ANTONIO2003 | 10,131 | 9,555 | 76 | 106
SAN ANTONIO2004 | 8,394 | 7,770 | 63 | 92
SAN ANTONIO2005 | 10,117 | 9,377 | 75 | 105
SAN ANTONIO2006 | 9,177 | 8,589 | 70 | 95
SAN ANTONIO2007 | 9,992 | 9,222 | 74 | 102
SEATTLE1991 | 9,262 | 9,175 | 43 | 87
SEATTLE1992 | 9,687 | 9,462 | 52 | 91
SEATTLE1993 | 10,788 | 10,181 | 65 | 101
SEATTLE1994 | 9,161 | 8,414 | 65 | 87
SEATTLE1995 | 9,444 | 8,748 | 58 | 86
SEATTLE1996 | 8,955 | 8,324 | 68 | 87
SEATTLE1997 | 9,438 | 8,751 | 63 | 93
SEATTLE1998 | 9,198 | 8,634 | 65 | 92
SEATTLE1999 | 4,743 | 4,797 | 25 | 50
SEATTLE2000 | 8,591 | 8,519 | 47 | 87
SEATTLE2001 | 7,978 | 7,976 | 44 | 82
SEATTLE2002 | 8,445 | 8,248 | 47 | 87
SEATTLE2003 | 7,555 | 7,565 | 40 | 82
SEATTLE2004 | 7,964 | 8,016 | 37 | 82
SEATTLE2005 | 9,197 | 9,028 | 58 | 93
SEATTLE2006 | 8,411 | 8,659 | 35 | 82
SEATTLE2007 | 8,130 | 8,367 | 31 | 82
TORONTO1996 | 6,995 | 7,524 | 17 | 72
TORONTO1997 | 7,101 | 7,310 | 28 | 74
TORONTO1998 | 7,781 | 8,541 | 16 | 82
TORONTO1999 | 4,557 | 4,639 | 23 | 50
TORONTO2000 | 8,219 | 8,244 | 45 | 85
TORONTO2001 | 9,113 | 8,917 | 53 | 94
TORONTO2002 | 7,913 | 7,963 | 44 | 87
TORONTO2003 | 7,453 | 7,934 | 24 | 82
TORONTO2004 | 7,006 | 7,253 | 33 | 82
TORONTO2005 | 8,178 | 8,311 | 33 | 82
TORONTO2006 | 8,286 | 8,532 | 27 | 82
TORONTO2007 | 8,702 | 8,653 | 49 | 88
UTAH1991 | 9,473 | 9,180 | 58 | 91
UTAH1992 | 10,523 | 9,993 | 64 | 98
UTAH1993 | 9,145 | 8,988 | 49 | 87
UTAH1994 | 9,872 | 9,500 | 61 | 98
UTAH1995 | 9,246 | 8,609 | 62 | 87
UTAH1996 | 9,321 | 8,705 | 59 | 92
UTAH1997 | 9,466 | 8,721 | 72 | 92
UTAH1998 | 10,058 | 9,480 | 75 | 102
UTAH1999 | 5,647 | 5,301 | 42 | 61
UTAH2000 | 8,797 | 8,480 | 59 | 92
UTAH2001 | 8,407 | 8,043 | 55 | 87
UTAH2002 | 8,223 | 8,154 | 45 | 86
UTAH2003 | 8,227 | 8,084 | 48 | 87
UTAH2004 | 7,271 | 7,371 | 42 | 82
UTAH2005 | 7,625 | 7,975 | 26 | 82
UTAH2006 | 7,573 | 7,789 | 41 | 82
UTAH2007 | 9,986 | 9,736 | 60 | 99
WASHINGTON1991 | 8,313 | 8,721 | 30 | 82
WASHINGTON1992 | 8,395 | 8,761 | 25 | 82
WASHINGTON1993 | 8,353 | 8,930 | 22 | 82
WASHINGTON1994 | 8,229 | 8,834 | 24 | 82
WASHINGTON1995 | 8,242 | 8,701 | 21 | 82
WASHINGTON1996 | 6,770 | 6,781 | 28 | 66
WASHINGTON1997 | 8,215 | 8,138 | 42 | 83
WASHINGTON1998 | 7,969 | 7,921 | 42 | 82
WASHINGTON1999 | 4,560 | 4,672 | 18 | 50
WASHINGTON2000 | 7,921 | 8,190 | 29 | 82
WASHINGTON2001 | 7,645 | 8,192 | 19 | 82
WASHINGTON2002 | 7,609 | 7,724 | 37 | 82
WASHINGTON2003 | 7,502 | 7,585 | 37 | 82
WASHINGTON2004 | 7,527 | 7,990 | 25 | 82
WASHINGTON2005 | 9,245 | 9,297 | 49 | 92
WASHINGTON2006 | 8,946 | 8,793 | 44 | 88
WASHINGTON2007 | 8,922 | 8,999 | 41 | 86
The goal is to select an exponent, n, for the modified Pythagorean expectation equation
Expected NBA Wins ≈ Games Played * PointsFor^n/(PointsFor^n + PointsAgainst^n)
that best "fits" the actual data set. You should note that by dividing through by "Games Played" we then have an estimator of the probability of a team winning a particular game given the expected number of points scored for and against.
When attempting to match actual frequencies with predicted expectations, one frequently uses what's known as "logistic modeling". Realize that probabilities are necessarily bounded by 0 and 1 and are inherently nonlinear in nature. This can be intuitively understood by considering the difference between a 100% probability event overvalued at 99.01% (US: -10,000) versus a 50% probability event overvalued at 49.50% (US: +102). Both events are overvalued by 1%, but in the former case a growth-maximizing investor should be willing to bet and borrow every penny possibly available to him, while in the latter case such a bettor would invest less than 1% of his bankroll.
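To put numbers on the two bets above, here is a small sketch. It assumes "growth-maximizing" refers to the Kelly criterion, which the post does not name explicitly; the function name is hypothetical.

```python
def kelly_fraction(p, decimal_odds):
    """Kelly stake as a fraction of bankroll for win probability p at the given decimal odds."""
    b = decimal_odds - 1              # net payout per unit staked
    return (p * decimal_odds - 1) / b

# Certain event priced at 99.01% implied probability (US -10,000)
sure_thing = kelly_fraction(1.0, 1 / 0.9901)
# Coin flip priced at 49.50% implied probability (US +102)
coin_flip = kelly_fraction(0.5, 1 / 0.4950)
print(sure_thing, coin_flip)
```

The first stake comes out at the full bankroll (the formula cannot express borrowing), while the second is just under 1% of it, even though both prices are off by the same 1%.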
Logistic modeling defines the following function of a probability, p:
logit(p) = ln(p) - ln(1-p)
Another way of saying this is that the logit() of a probability is the logarithm of the "fair" odds in favor of the event, p/(1-p), which is the negative of the logarithm of the "fair" fractional payout odds (i.e., decimal odds-1) associated with that probability. So in other words, the logit() of an event that occurs exactly half of the time (p=50%) would be 0, the logit() of an event that always occurs (p=100%) would be +∞, and the logit() of an event that never occurs (p=0%) would be -∞.
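A minimal sketch of the logit function as defined above, with a couple of spot checks. Only the standard library is used; the variable names are illustrative.

```python
import math

def logit(p):
    """logit(p) = ln(p) - ln(1 - p), defined for 0 < p < 1."""
    return math.log(p) - math.log(1 - p)

even_money = logit(0.5)     # an event that happens half the time
favorite = logit(0.75)      # ln(0.75 / 0.25) = ln(3)
underdog = logit(0.25)      # mirror image of the favorite
print(even_money, favorite, underdog)
```

The symmetry (logit(p) = -logit(1-p)) is what makes the logit scale convenient for comparing a team's chances with its opponent's.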
We then proceed by determining the exponent that minimizes the sum of the squared deviations between the logit of the actual win percentage and the logit of the Pythagorean expectation. This is a straightforward problem of convex optimization well within the capabilities of Microsoft Excel Solver. If you look at Sheet1 of the attached spreadsheet you'll see that the exponent value that provides the best fit for the data set is roughly 14.02. This is frighteningly close to the value of 14 attributed to Dean Oliver by Wikipedia.
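For readers without the spreadsheet, the fit can be sketched in a few lines. This is not the Solver setup from the post but an equivalent shortcut: since logit(PF^n/(PF^n + PA^n)) simplifies to n * ln(PF/PA), the least-squares exponent has a closed form (a regression through the origin). The sample below uses only five rows from the table above, so it will not reproduce 14.02; the function name is hypothetical.

```python
import math

# (points_for, points_against, wins, games) for five rows of the table above
sample = [
    (9489, 9466, 45, 87),     # ATLANTA1991
    (10359, 9869, 61, 93),    # BOSTON1991
    (10791, 9847, 76, 99),    # CHICAGO1991
    (8195, 8570, 28, 82),     # DALLAS1991
    (9828, 10723, 20, 82),    # DENVER1991
]

def fit_exponent(rows):
    """Least-squares n minimizing sum of (logit(win%) - n * ln(PF/PA))^2."""
    sxy = sxx = 0.0
    for pf, pa, wins, games in rows:
        x = math.log(pf / pa)                           # logit of Pythagorean expectation, per unit n
        y = math.log(wins) - math.log(games - wins)     # logit of actual win percentage
        sxy += x * y
        sxx += x * x
    return sxy / sxx

n = fit_exponent(sample)
print(round(n, 2))
```

A team-season with zero wins or zero losses would have an infinite logit and would need to be dropped or smoothed before fitting.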
Determination of relative goodness-of-fit for this and other sports is left as an exercise for the interested reader (translation: don't expect me to chime in, but please see this paper if particularly motivated).
So we have:
NBA Win Probability ≈ PointsFor^14.02/(PointsFor^14.02 + PointsAgainst^14.02)
Because the OP wasn't so kind as to provide values for the variables w, x, y, and z I'll arbitrarily assign them as follows:
w = 74
x = 73
y = 68
z = 72
Using the Pythagorean expectation formula above we come up with predicted win probabilities as follows (see Sheet2 of the attached spreadsheet):
Team A: 68.44%
Team A opp: 54.98%
Team B: 59.81%
Team B opp: 64.31%
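These four figures can be reproduced with the formula and the values given above. A minimal sketch follows, using the rounded exponent 14.02 (the spreadsheet presumably carries more digits, so the last decimal place can differ slightly); the function name is hypothetical.

```python
def win_probability(points_for, points_against, n=14.02):
    """Pythagorean per-game win probability."""
    return points_for**n / (points_for**n + points_against**n)

w, x, y, z = 74, 73, 68, 72   # the arbitrarily assigned values from the post

team_a = win_probability(78.2, w)          # Team A: points for vs. points against
team_a_opp = win_probability(x, 71.96667)  # Team A's opponents
team_b = win_probability(y, 66.1)          # Team B
team_b_opp = win_probability(75.0875, z)   # Team B's opponents

for label, p in [("Team A", team_a), ("Team A opp", team_a_opp),
                 ("Team B", team_b), ("Team B opp", team_b_opp)]:
    print(f"{label}: {p:.2%}")
```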
If we assume that Team A & B's opponents have themselves established their win records against "average" opposition we can then impute Team A and B's respective performances against average opposition. See the following post I had made at another site for a brief discussion of the general methodology:
Originally Posted by Ganchrow
Originally Posted by trytowin
I see on nba.com in the standings section that Boston has a home win percentage of 0.87879 (29-4), and that Utah has a road win percentage of 0.42857 (15-20). The question is, what is the expected probability based on these two numbers only that Boston will win tonight's game?
a) 0.5 - 0.42857 = 0.07143 + 0.87879 ==> 0.95022
b) 0.42857 + 0.87879 / 2 ==> 0.65368
c) (0.87879 - 0.42857) * 2 ==> 0.90044
d) none of the above
I'm not sure of the correct answer, so please explain. Thx, TTW
The Sabermetrics guys have a methodology for handling this called "log5", but it assumes a league average home team win probability of 50%.
Just thinking about it logically, here's how I'd do it:
Let A = Team A home win % (taken as an unbiased estimator of Boston's "true" home win probability versus an "average" opponent) = 29/33
Let B = Team B road win% (taken as an unbiased estimator of Utah's "true" road win probability versus an "average" opponent) = 15/35
Let H* = expected home win probability between two equally matched teams = 61% (which is roughly the total NBA home win frequency over the past 25 years)
Let P = expected probability of A beating B playing at A's Home
Define the logit function of a probability p as the log of the "fair" fractional payout odds p/(1-p) so that:
So based on the above data, Boston's win probability over Utah was about 86.073%, implying fair US payout odds of about -618.
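The formulas that followed "so that:" did not survive in this copy of the quoted post. The sketch below is a reconstruction, not the quoted author's own text: it assumes logit(P) = logit(A) - logit(B) - logit(H*), on the reasoning that A's home record and B's road record each already contain one helping of home advantage. That assumption does reproduce the quoted 86.073% and -618.

```python
import math

def logit(p):
    return math.log(p) - math.log(1 - p)

def logistic(x):
    return 1 / (1 + math.exp(-x))

home_win = 29 / 33      # A: Boston's home win %
road_win = 15 / 35      # B: Utah's road win %
league_home = 0.61      # H*: home win probability between equal teams

# Assumed combination rule (see the note above)
p = logistic(logit(home_win) - logit(road_win) - logit(league_home))
us_odds = -100 * p / (1 - p)    # fair US odds for a favorite
print(f"{p:.3%}", round(us_odds))
```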
(You can also see a preview of the following not-yet-ready-for-prime-time version of a calculator that performs these calculations here. Final version to be released soon.)
logit(A win prob vs. B) = logit(A win prob vs. Avg. Opp) - logit(B win prob vs. Avg. Opp) = 0.974158898 - 0.98630472 = -0.012145822
Inverting the logit function yields what's known as the logistic function:
p = 1 / (1+exp(-logit(p)))
giving us a final value for the probability of A defeating B of:
p = 1 / (1+exp(0.012145822)) ≈ 49.70%
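The step from the four Pythagorean probabilities to the two "vs. Avg. Opp" logits is not spelled out above. Judging from the numbers, each appears to be the sum of a team's own logit and its opponents' logit; that is an inference, not something the post states. A sketch starting from the rounded percentages:

```python
import math

def logit(p):
    return math.log(p) - math.log(1 - p)

def logistic(x):
    return 1 / (1 + math.exp(-x))

# Rounded win probabilities listed in the post
team_a, team_a_opp = 0.6844, 0.5498
team_b, team_b_opp = 0.5981, 0.6431

# Inferred step: beating above-average opponents raises a team's rating vs. average
a_vs_avg = logit(team_a) + logit(team_a_opp)
b_vs_avg = logit(team_b) + logit(team_b_opp)

p_a_beats_b = logistic(a_vs_avg - b_vs_avg)
print(round(a_vs_avg, 3), round(b_vs_avg, 3), f"{p_a_beats_b:.1%}")
```

Because the inputs are rounded to two decimals, the logits agree with 0.974158898 and 0.98630472 only to about three decimal places, and the final probability lands within a few hundredths of a percent of 49.70%.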
(If using the alpha-version calculator linked above, you'd enter the team in question's win probability into the "Home Team" box, 1-opponent's win probability into the "Away Team" box, and 50% into the "League Average" box. This will automatically perform the logit calculations. Comments are welcome.)
If, however, you'd settle for roughly estimating Team A's win probability vs. Team B, then there is a straightforward (if slightly involved) methodology that may be employed to this end. Still necessary, however, is the assumption that the opponents of Team A and B have themselves played against directly relatable opposition (consider that there'd be no way one would be able to draw meaningful conclusions about the Knicks' likely performance against a Division III team based solely on the two teams' results within their respective leagues).
I think that in basketball one can get at least better than meaningless assumptions based on league averages. I would also be interested to hear from European players chiming in on eurocups linemaking.
I think that in basketball one can get at least better than meaningless assumptions based on league averages.
Please explain how this is possible within the context of the provided example (determining a fair market line between the Knicks and a Division III team based solely on the two teams' results within their respective leagues).
Clearly there would first need to be some predefined means of making inter-league comparisons. League averages would by themselves be wholly insufficient.
Clearly there would first need to be some predefined means of making inter-league comparisons.
My first thought is that in sports like basketball or baseball the game outcome depends, to a great extent, on the sum of some "units" that reflect individual players' skills, while in sports like football or soccer team play has a much larger effect. Thus, the former sports provide the means for inter-league comparisons just by adding those "units".
League averages would by themselves be wholly insufficient.
True. League averages, which were not mentioned in this thread, are insufficient yet necessary for making this kind of prediction, whether intra- or inter-league.
The whole point of the exercise outlined in this post is to use both Team A and B's average scoring (for and against), as well as that of their opponents (established versus league averages), to determine a fair money line.
It seems that you are talking about estimating win%, which essentially can be done using a variety of methods, including Pythagorean, that use offensive/defensive efficiencies (ratings, scores, etc.). The OP and I are talking about estimating those efficiencies/ratings/whatever.
Perhaps this is overly simplistic, but would it be possible to estimate these values using a simple linear regression? Each previous game could be represented as:
Team_A_Off_Rating - Team_B_Def_Rating = Team A's Points For that Game (or points per 100 possessions for that game or something similar)
You could then run a regression to determine the values of each team's offensive and defensive ratings, adjusted for opponent quality. (Note that this would not produce actual points/game or efficiency values, but I think it would produce relative values that could be used to predict offensive efficiency for future matchups.) It seems to me that this would automatically account for points scored/allowed in previous games by both teams, as well as the points scored/allowed by their opponents, and their opponents' opponents, and so on...
Obviously, this wouldn't work for Ganchrow's example of the Knicks playing a DIII team, but I think that it could work for two teams that could be 'indirectly' linked through previous games.
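The regression idea above can be sketched without any libraries. The code below fits the model points ≈ offense rating - defense rating by alternating averages, which converges to a least-squares solution of that model; it is a stand-in for a proper regression package, not a replacement. All team names, ratings, and games are made up, and the generated scores are noise-free so the held-out prediction can be checked exactly.

```python
from collections import defaultdict

def fit_ratings(games, iterations=200):
    """games: list of (offense_team, defense_team, points scored by the offense)."""
    off = defaultdict(float)   # offensive ratings
    dfn = defaultdict(float)   # defensive ratings (higher = stingier defense)
    for _ in range(iterations):
        # Best offensive ratings given the current defensive ratings
        total, count = defaultdict(float), defaultdict(int)
        for o, d, pts in games:
            total[o] += pts + dfn[d]
            count[o] += 1
        for team in total:
            off[team] = total[team] / count[team]
        # Best defensive ratings given the updated offensive ratings
        total, count = defaultdict(float), defaultdict(int)
        for o, d, pts in games:
            total[d] += off[o] - pts
            count[d] += 1
        for team in total:
            dfn[team] = total[team] / count[team]
    return off, dfn

# Hypothetical "true" ratings used only to generate example scores
true_off = {"A": 80.0, "B": 75.0, "C": 70.0, "D": 72.0}
true_dfn = {"A": 5.0, "B": 8.0, "C": 2.0, "D": 0.0}

# Every pairing except A's offense vs. B's defense, which is held out
games = [(o, d, true_off[o] - true_dfn[d])
         for o in true_off for d in true_dfn
         if o != d and (o, d) != ("A", "B")]

off, dfn = fit_ratings(games)
prediction = off["A"] - dfn["B"]   # points A is expected to score on B
print(round(prediction, 3))
```

As noted in the post, the individual ratings are only relative (adding the same constant to every offensive and defensive rating changes nothing), but the difference used for a matchup is well defined as long as the two teams are linked through common opponents.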