1. #1
    tool21
    Join Date: 01-25-08
    Posts: 102

    Math Model: how do you compare teams who played different difficulty opponents??

    I always like to include examples of what I'm talking about, and this question is best described with one. I just want the BEST math behind this and the right way to do it. I have tried about 4 different methods and I have my favorites, but I want to know how you guys would do it. This is just something small that I need to zero in on for accuracy.



    Answer this question and explain how you would do it.

    If Team A's offense averages 78.2 points against teams that give up 71.96667 on average, and Team B's defense gives up 66.1 points against teams that put up 75.0875 on average, then what is a good estimate of how many points Team A will put up based on the given information?

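    Team A
    Points scored per game: 78.2
    Opponents' average points allowed: 71.9667

    Team B
    Points allowed per game: 66.1
    Opponents' average points scored: 75.0875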

    The problem I'm coming up with is with the denominator. If Team A's difference is +6.233 and Team B's difference is -8.99, then I come up with -2.75 against both teams' opponents, but which denominator (opp) do I add this to? Do I take the average of the denominators and then add it to get 70.78 points for Team A?

    Another method is this: Team A averages 78.2 against teams that give up 71.9667 on average, but Team B doesn't give up 71.9667, they give up 66.1, so Team A's offense will have 71.825 (78.2 × 66.1 / 71.9667). On the other side, Team B gives up 66.1 on average against teams that average 75.0875, but Team A doesn't put up 75.0875, they put up 78.2, so Team B's defense will give up 68.84 (66.1 × 78.2 / 75.0875). Then take the average of 68.84 and 71.825, which is about 70.33.
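The two approaches described so far can be written out so the arithmetic is easy to check. This is only a sketch under my reading of the post: "method 1" adds both differentials to the average of the two opponent baselines, and "method 2" scales each side's average by the ratio of the actual opponent to the typical opponent. The function names are made up for illustration.

```python
def additive_estimate(a_for, a_opp_allowed, b_allowed, b_opp_scored):
    """Method 1: add both differentials to the mean of the two baselines."""
    a_diff = a_for - a_opp_allowed      # how much A outscores what its opponents usually allow
    b_diff = b_allowed - b_opp_scored   # how much B holds teams below what they usually score
    baseline = (a_opp_allowed + b_opp_scored) / 2
    return baseline + a_diff + b_diff


def ratio_estimate(a_for, a_opp_allowed, b_allowed, b_opp_scored):
    """Method 2: rescale each side by actual opponent / typical opponent, then average."""
    offense_view = a_for * b_allowed / a_opp_allowed   # A's scoring, adjusted for B's defense
    defense_view = b_allowed * a_for / b_opp_scored    # B's points allowed, adjusted for A's offense
    return offense_view, defense_view, (offense_view + defense_view) / 2


method1 = additive_estimate(78.2, 71.96667, 66.1, 75.0875)
offense_view, defense_view, method2 = ratio_estimate(78.2, 71.96667, 66.1, 75.0875)
print(round(method1, 2), round(offense_view, 3), round(defense_view, 2), round(method2, 2))
```

Both are heuristics; neither settles which baseline is the right one, which is the question being asked.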

    Obviously there IS a distinct difference in the difficulty of the opponents, and there is a difference between 68.84 points and 71.825 points in the 2nd example. Should I just stick to games in which the opponents' difficulty is fairly similar? Divisional games in NCAAB, and NBA mid-season? The NBA isn't much of a problem because the teams all play each other. The denominators are fairly close, which leads to less error, but IMO NBA lines are a lot sharper than college lines. College is where the problem lies. Take a team like Davidson playing Kansas. Slight variations in opponents' difficulty.

    Another problem I found with these two methods shows up when two teams both like to run. Say you have Phoenix playing Denver. If you use the 2nd method you are taking averages, which leads to a false total. Say Team A's offense is +6 and Team B's defense is +4; then you would have +10 relative to the teams they've played against. In method 2 the two numbers would be averaged and a false total would come out. So instead of having a 117-112 score (method 2) you should really have a 124-119 (method 1) type score. But like I asked before, which denominator do I add the numbers to? So I more or less just need help with the first method because the 2nd gives false totals....sorry for all the info, just thinking on paper.


    Just throwing out some ideas, I have various others that tweak this....what would you guys do?
    Last edited by tool21; 04-17-08 at 07:36 PM.

  2. #2
    BuddyBear
    Join Date: 08-10-05
    Posts: 7,233
    Betpoints: 4805

    I didn't read the entire message b/c it's too long, but based on the first paragraph and the title of this thread you are interested in controlling for the quality of competition.

    I do not think the way you are considering this issue by looking at point differential is very valid but some might argue that point.

    Well, basically, you would need to develop a variable that measures opponents' strength of schedule and treat that as a stand-alone primary variable in your model, or you could simply treat it as a control variable...it would be up to you how you wanted to use it, but I certainly agree that it is important.

    IMO, this is an extremely difficult variable to construct because there are several validity issues that need to be considered before doing this. The biggest obstacle with this variable is that it is ALWAYS varying, and the earlier in a season the more volatile it is going to be. Really, it won't become stable until you get deep into the season and everyone has played everyone.

    There are a number of reputable websites that offer strength of schedule measures that you can consider. It will save you a ton of time.

  3. #3
    Ganchrow
    Nolite te bastardes carborundorum.
    Join Date: 08-28-05
    Posts: 5,011
    Betpoints: 1088

    BuddyBear brings up some very good points.

    He's particularly correct that without prior knowledge of the sport in question (in other words, additional data and/or theory), point estimates of scoring expectations are by themselves generally insufficient to provide the answer for which you're looking.

    Off-the-cuff I'd say that to really go about this theoretically correctly ("BEST math", as you say) you'd first want to model the pdf of each team's scoring (for or against as applicable) conditioned on some measure of its opponent's performance. You'd further want to model a marginal (i.e., a "prior") probability distribution for each team's (offensive or defensive) performance (assuming a one-to-one relationship existed between a team's marginal performance and its scoring). Finally you'd utilize Bayesian inference to combine all your estimates in order to determine a posterior pdf for scoring by team A conditioned on the joint realization of your predetermined performance measures. By integrating across all possible performance measure realizations you'd then be able to calculate a final expectation. This would be what I'd consider a serious project, with likely considerably overreaching complexity.

    If, however, you'd settle for roughly estimating Team A's win probability vs. Team B, then there is a straightforward (if slightly involved) methodology that may be employed to this end. Still necessary, however, is the assumption that the opponents of Team A and B have themselves played against directly relatable opposition (consider that there'd be no way one would be able to draw meaningful conclusions about the Knicks' likely performance against a Division III team based solely on the two teams' results within their respective leagues).

    I'll warn you, however, that the results at best represent a first-order approximation far better suited to Rotisserie League play than to professional advantage betting. Nevertheless, the general techniques involved may be instructive and could potentially prove useful within alternate contexts.

    Given by OP:
    Team A's points for = 78.2
    Team A's opps' points against = 71.96667
    Team B's points against = 66.1
    Team B's opps' points for = 75.0875

    Let's add a few more variables to the problem specification, namely:
    Team A's points against = w
    Team A's opps' points for = x
    Team B's points for = y
    Team B's opps' points against = z

    Baseball's Bill James posited that the number of wins expected by a baseball team over some stretch of games could be reasonably estimated as a function solely of total runs scored and total runs allowed. This is known as the "Pythagorean expectation" and is given by the following formula:

    Expected MLB Wins = Games Played * RunsFor^n/(RunsFor^n + RunsAgainst^n)

    where n is a data-determined exponent equal to roughly 2 for Major League Baseball.
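For readers who prefer code to formulas, here is a minimal sketch of the Pythagorean expectation as stated above. The 800/700 run totals are hypothetical numbers chosen only to show the mechanics, and the function names are my own.

```python
def pythagorean_win_fraction(scored, allowed, exponent=2.0):
    """Expected fraction of games won, given totals scored and allowed."""
    s = scored ** exponent
    a = allowed ** exponent
    return s / (s + a)


def expected_wins(games, scored, allowed, exponent=2.0):
    """Expected wins over a stretch of games."""
    return games * pythagorean_win_fraction(scored, allowed, exponent)


# Hypothetical baseball team: 800 runs scored, 700 allowed, 162 games
wins = expected_wins(162, 800, 700)
print(round(wins, 1))
```

A team that scores exactly as much as it allows comes out at one half regardless of the exponent; the exponent only controls how quickly a scoring edge turns into wins.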

    It's certainly possible (although not necessarily advisable) to apply this same methodology to other sports. I'm going to demonstrate an extremely simple exponent estimation technique for the NBA. Uninterested readers won't lose much by skipping ahead to the conclusion.

    Using covers.com data one can derive the following data set through the 2006-07 season (also see Sheet1 of the attached spreadsheet):
    Columns: Team and season, Points For, Points Against, Wins, Games
    ATLANTA1991 9,489 9,466 45 87
    ATLANTA1992 8,711 8,834 38 82
    ATLANTA1993 9,101 9,214 43 85
    ATLANTA1994 9,321 8,855 63 93
    ATLANTA1995 8,189 8,116 42 85
    ATLANTA1996 8,315 8,262 46 85
    ATLANTA1997 8,204 7,813 56 87
    ATLANTA1998 8,210 7,916 51 86
    ATLANTA1999 5,032 4,912 34 59
    ATLANTA2000 7,735 8,176 28 82
    ATLANTA2001 7,459 7,886 25 82
    ATLANTA2002 7,711 8,058 33 82
    ATLANTA2003 7,714 8,006 35 82
    ATLANTA2004 7,611 7,992 28 82
    ATLANTA2005 7,605 8,401 13 82
    ATLANTA2006 7,972 8,362 26 82
    ATLANTA2007 7,680 8,070 30 82
    BOSTON1991 10,359 9,869 61 93
    BOSTON1992 9,816 9,518 57 92
    BOSTON1993 8,904 8,852 49 86
    BOSTON1994 8,267 8,618 32 82
    BOSTON1995 8,773 8,975 36 86
    BOSTON1996 8,002 8,258 30 77
    BOSTON1997 7,941 8,567 13 79
    BOSTON1998 7,864 8,079 36 82
    BOSTON1999 4,650 4,743 19 50
    BOSTON2000 8,146 8,208 35 82
    BOSTON2001 7,759 7,934 36 82
    BOSTON2002 9,361 9,135 58 98
    BOSTON2003 8,545 8,583 48 92
    BOSTON2004 8,149 8,335 36 86
    BOSTON2005 8,918 8,851 48 89
    BOSTON2006 8,033 8,159 33 82
    BOSTON2007 7,857 8,137 24 82
    CHARLOTTE2005 7,729 8,220 18 82
    CHARLOTTE2006 7,943 8,270 26 82
    CHARLOTTE2007 7,945 8,252 33 82
    CHICAGO1991 10,791 9,847 76 99
    CHICAGO1992 11,219 10,227 82 104
    CHICAGO1993 10,570 9,943 72 101
    CHICAGO1994 8,962 8,696 60 92
    CHICAGO1995 9,306 8,916 52 92
    CHICAGO1996 9,605 8,457 82 92
    CHICAGO1997 9,689 8,702 80 95
    CHICAGO1998 9,887 9,157 77 103
    CHICAGO1999 4,095 4,568 13 50
    CHICAGO2000 6,952 7,723 17 82
    CHICAGO2001 7,181 7,927 15 82
    CHICAGO2002 7,335 8,035 21 82
    CHICAGO2003 7,786 8,207 30 82
    CHICAGO2004 7,355 7,876 23 82
    CHICAGO2005 8,360 8,284 49 88
    CHICAGO2006 8,610 8,576 43 88
    CHICAGO2007 9,024 8,602 55 92
    CLEVELAND1991 8,343 8,545 33 82
    CLEVELAND1992 10,687 10,197 66 99
    CLEVELAND1993 9,675 9,165 57 91
    CLEVELAND1994 8,580 8,270 47 85
    CLEVELAND1995 7,747 7,729 44 86
    CLEVELAND1996 6,718 6,598 40 74
    CLEVELAND1997 6,867 6,713 40 78
    CLEVELAND1998 7,908 7,716 48 86
    CLEVELAND1999 4,322 4,408 22 50
    CLEVELAND2000 7,950 8,237 32 82
    CLEVELAND2001 7,561 7,909 30 82
    CLEVELAND2002 7,812 8,085 29 82
    CLEVELAND2003 7,495 8,284 17 82
    CLEVELAND2004 7,619 7,834 35 82
    CLEVELAND2005 7,914 7,849 42 82
    CLEVELAND2006 9,177 9,035 57 95
    CLEVELAND2007 9,710 9,353 62 102
    DALLAS1991 8,195 8,570 28 82
    DALLAS1992 8,007 8,634 22 82
    DALLAS1993 8,141 9,387 11 82
    DALLAS1994 7,811 8,514 13 82
    DALLAS1995 8,463 8,700 36 82
    DALLAS1996 5,632 5,846 19 55
    DALLAS1997 6,921 7,374 23 76
    DALLAS1998 7,494 7,995 20 82
    DALLAS1999 4,581 4,701 19 50
    DALLAS2000 8,316 8,363 40 82
    DALLAS2001 9,161 8,847 57 92
    DALLAS2002 9,496 9,142 61 90
    DALLAS2003 10,526 9,898 70 102
    DALLAS2004 9,124 8,753 53 87
    DALLAS2005 9,772 9,322 64 95
    DALLAS2006 10,424 9,832 74 105
    DALLAS2007 8,792 8,240 69 88
    DENVER1991 9,828 10,723 20 82
    DENVER1992 8,176 8,821 24 82
    DENVER1993 8,626 8,769 36 82
    DENVER1994 9,369 9,263 48 94
    DENVER1995 8,588 8,565 41 85
    DENVER1996 6,639 6,841 29 68
    DENVER1997 7,439 7,906 20 76
    DENVER1998 7,300 8,266 11 82
    DENVER1999 4,674 5,004 14 50
    DENVER2000 8,115 8,289 35 82
    DENVER2001 7,918 8,120 40 82
    DENVER2002 7,559 8,036 27 82
    DENVER2003 6,900 7,580 17 82
    DENVER2004 8,425 8,356 44 87
    DENVER2005 8,612 8,497 50 87
    DENVER2006 8,664 8,683 45 87
    DENVER2007 9,080 8,977 46 87
    DETROIT1991 9,721 9,470 57 97
    DETROIT1992 8,439 8,408 49 87
    DETROIT1993 8,252 8,366 40 82
    DETROIT1994 7,949 8,587 20 82
    DETROIT1995 8,053 8,651 28 82
    DETROIT1996 7,671 7,503 44 80
    DETROIT1997 7,719 7,324 52 82
    DETROIT1998 7,721 7,592 37 82
    DETROIT1999 4,914 4,758 31 55
    DETROIT2000 8,722 8,635 42 85
    DETROIT2001 7,837 7,976 32 82
    DETROIT2002 8,563 8,390 54 92
    DETROIT2003 9,021 8,702 58 99
    DETROIT2004 9,391 8,765 70 105
    DETROIT2005 9,911 9,476 69 107
    DETROIT2006 9,591 8,987 74 100
    DETROIT2007 9,334 8,946 63 98
    GOLDEN STATE1991 10,594 10,473 48 91
    GOLDEN STATE1992 10,200 9,878 56 86
    GOLDEN STATE1993 9,007 9,071 34 82
    GOLDEN STATE1994 9,192 9,069 50 85
    GOLDEN STATE1995 8,661 9,122 26 82
    GOLDEN STATE1996 7,653 7,759 34 75
    GOLDEN STATE1997 7,794 8,140 29 78
    GOLDEN STATE1998 7,237 7,985 19 82
    GOLDEN STATE1999 4,416 4,541 21 50
    GOLDEN STATE2000 7,834 8,512 19 82
    GOLDEN STATE2001 7,584 8,326 17 82
    GOLDEN STATE2002 8,009 8,452 21 82
    GOLDEN STATE2003 8,400 8,493 38 82
    GOLDEN STATE2004 7,649 7,709 37 82
    GOLDEN STATE2005 8,094 8,271 34 82
    GOLDEN STATE2006 8,076 8,187 34 82
    GOLDEN STATE2007 9,910 9,919 47 93
    HOUSTON1991 9,033 8,763 52 85
    HOUSTON1992 8,366 8,507 42 82
    HOUSTON1993 9,704 9,339 61 94
    HOUSTON1994 9,943 9,485 70 98
    HOUSTON1995 10,389 10,182 58 100
    HOUSTON1996 7,305 7,145 44 72
    HOUSTON1997 9,068 8,711 62 90
    HOUSTON1998 8,522 8,618 43 87
    HOUSTON1999 5,099 4,992 32 54
    HOUSTON2000 8,156 8,227 34 82
    HOUSTON2001 7,972 7,784 45 82
    HOUSTON2002 7,572 7,973 28 82
    HOUSTON2003 7,689 7,567 43 82
    HOUSTON2004 7,785 7,670 46 87
    HOUSTON2005 8,479 8,167 54 89
    HOUSTON2006 7,387 7,517 34 82
    HOUSTON2007 8,564 8,188 55 89
    INDIANA1991 9,751 9,785 43 87
    INDIANA1992 9,520 9,387 40 85
    INDIANA1993 9,247 9,107 42 86
    INDIANA1994 9,701 9,404 56 98
    INDIANA1995 9,836 9,467 63 99
    INDIANA1996 7,685 7,425 49 77
    INDIANA1997 7,487 7,467 36 79
    INDIANA1998 9,342 8,807 68 98
    INDIANA1999 5,950 5,716 42 63
    INDIANA2000 10,562 10,124 69 105
    INDIANA2001 7,940 7,981 42 86
    INDIANA2002 8,396 8,373 44 87
    INDIANA2003 8,487 8,235 50 88
    INDIANA2004 8,861 8,318 71 98
    INDIANA2005 8,718 8,694 50 95
    INDIANA2006 8,235 8,102 43 88
    INDIANA2007 7,840 8,040 35 82
    L.A. CLIPPERS1991 8,491 8,774 31 82
    L.A. CLIPPERS1992 8,931 8,863 47 87
    L.A. CLIPPERS1993 9,220 9,232 43 87
    L.A. CLIPPERS1994 8,447 8,916 27 82
    L.A. CLIPPERS1995 7,937 8,678 17 82
    L.A. CLIPPERS1996 7,463 7,754 26 75
    L.A. CLIPPERS1997 7,866 8,150 32 81
    L.A. CLIPPERS1998 7,865 8,469 17 82
    L.A. CLIPPERS1999 4,519 4,960 9 50
    L.A. CLIPPERS2000 7,546 8,491 15 82
    L.A. CLIPPERS2001 7,581 7,818 31 82
    L.A. CLIPPERS2002 7,844 7,884 39 82
    L.A. CLIPPERS2003 7,693 8,032 27 82
    L.A. CLIPPERS2004 7,770 8,147 28 82
    L.A. CLIPPERS2005 7,849 7,912 37 82
    L.A. CLIPPERS2006 9,238 9,064 54 94
    L.A. CLIPPERS2007 7,843 7,881 40 82
    L.A. LAKERS1991 10,691 10,117 70 101
    L.A. LAKERS1992 8,607 8,756 44 86
    L.A. LAKERS1993 9,041 9,154 41 87
    L.A. LAKERS1994 8,233 8,585 33 82
    L.A. LAKERS1995 9,521 9,591 53 92
    L.A. LAKERS1996 7,796 7,526 46 76
    L.A. LAKERS1997 8,024 7,760 52 81
    L.A. LAKERS1998 9,955 9,304 68 95
    L.A. LAKERS1999 5,702 5,574 34 58
    L.A. LAKERS2000 10,562 9,807 82 105
    L.A. LAKERS2001 9,905 9,424 71 98
    L.A. LAKERS2002 10,161 9,507 73 101
    L.A. LAKERS2003 9,433 9,239 56 94
    L.A. LAKERS2004 9,991 9,651 69 104
    L.A. LAKERS2005 8,095 8,338 34 82
    L.A. LAKERS2006 8,858 8,700 48 89
    L.A. LAKERS2007 8,964 9,022 43 87
    MEMPHIS1996 6,515 7,207 13 72
    MEMPHIS1997 6,667 7,474 12 75
    MEMPHIS1998 7,923 8,522 19 82
    MEMPHIS1999 4,443 4,876 8 50
    MEMPHIS2000 7,702 8,163 22 82
    MEMPHIS2001 7,522 7,992 23 82
    MEMPHIS2002 7,366 7,978 23 82
    MEMPHIS2003 7,995 8,260 28 82
    MEMPHIS2004 8,264 8,120 50 86
    MEMPHIS2005 8,072 7,928 45 86
    MEMPHIS2006 7,895 7,648 49 86
    MEMPHIS2007 8,331 8,753 22 82
    MIAMI1991 8,349 8,840 24 82
    MIAMI1992 8,906 9,305 38 85
    MIAMI1993 8,495 8,599 36 82
    MIAMI1994 8,924 8,739 44 87
    MIAMI1995 8,293 8,428 32 82
    MIAMI1996 7,051 6,954 36 73
    MIAMI1997 8,773 8,314 66 94
    MIAMI1998 8,224 7,831 57 87
    MIAMI1999 4,844 4,616 35 55
    MIAMI2000 8,571 8,291 58 92
    MIAMI2001 7,524 7,403 50 85
    MIAMI2002 7,150 7,274 36 82
    MIAMI2003 7,016 7,430 25 82
    MIAMI2004 8,495 8,450 48 95
    MIAMI2005 9,797 9,198 70 97
    MIAMI2006 10,405 10,001 68 105
    MIAMI2007 8,114 8,233 44 86
    MILWAUKEE1991 9,029 8,860 48 85
    MILWAUKEE1992 8,608 8,749 31 82
    MILWAUKEE1993 8,391 8,699 28 82
    MILWAUKEE1994 7,949 8,480 20 82
    MILWAUKEE1995 8,160 8,497 34 82
    MILWAUKEE1996 7,451 7,878 24 78
    MILWAUKEE1997 7,142 7,291 29 75
    MILWAUKEE1998 7,748 7,905 36 82
    MILWAUKEE1999 4,870 4,818 28 53
    MILWAUKEE2000 8,780 8,753 44 87
    MILWAUKEE2001 9,982 9,639 62 100
    MILWAUKEE2002 7,996 8,014 41 82
    MILWAUKEE2003 8,744 8,752 44 88
    MILWAUKEE2004 8,467 8,442 42 87
    MILWAUKEE2005 7,973 8,218 30 82
    MILWAUKEE2006 8,508 8,641 41 87
    MILWAUKEE2007 8,172 8,531 28 82
    MINNESOTA1991 8,169 8,491 29 82
    MINNESOTA1992 8,237 8,815 15 82
    MINNESOTA1993 8,042 8,684 19 82
    MINNESOTA1994 7,930 8,498 20 82
    MINNESOTA1995 7,716 8,464 21 82
    MINNESOTA1996 6,886 7,332 20 71
    MINNESOTA1997 7,703 7,855 38 80
    MINNESOTA1998 8,741 8,712 47 87
    MINNESOTA1999 4,969 4,975 26 54
    MINNESOTA2000 8,420 8,221 51 86
    MINNESOTA2001 8,310 8,225 48 86
    MINNESOTA2002 8,451 8,206 50 85
    MINNESOTA2003 8,648 8,516 53 88
    MINNESOTA2004 9,408 8,959 68 100
    MINNESOTA2005 7,934 7,815 44 82
    MINNESOTA2006 7,522 7,676 33 82
    MINNESOTA2007 7,877 8,178 32 82
    NEW JERSEY1991 8,441 8,811 26 82
    NEW JERSEY1992 9,048 9,220 41 86
    NEW JERSEY1993 8,899 8,819 45 87
    NEW JERSEY1994 8,807 8,656 46 86
    NEW JERSEY1995 8,042 8,299 30 82
    NEW JERSEY1996 6,943 7,249 28 74
    NEW JERSEY1997 7,432 7,791 24 76
    NEW JERSEY1998 8,461 8,353 43 85
    NEW JERSEY1999 4,569 4,758 16 50
    NEW JERSEY2000 8,036 8,120 31 82
    NEW JERSEY2001 7,552 7,966 26 82
    NEW JERSEY2002 9,796 9,456 63 102
    NEW JERSEY2003 9,693 9,198 63 102
    NEW JERSEY2004 8,371 8,139 54 93
    NEW JERSEY2005 7,883 8,057 42 86
    NEW JERSEY2006 8,727 8,625 54 93
    NEW JERSEY2007 9,083 9,124 47 94
    NEW ORLEANS1991 8,428 8,858 26 82
    NEW ORLEANS1992 8,980 9,300 31 82
    NEW ORLEANS1993 9,930 9,955 48 91
    NEW ORLEANS1994 8,732 8,750 41 82
    NEW ORLEANS1995 8,619 8,366 51 86
    NEW ORLEANS1996 7,517 7,595 36 73
    NEW ORLEANS1997 7,634 7,496 50 77
    NEW ORLEANS1998 8,668 8,558 55 91
    NEW ORLEANS1999 4,644 4,649 26 50
    NEW ORLEANS2000 8,437 8,229 50 86
    NEW ORLEANS2001 8,496 8,261 52 92
    NEW ORLEANS2002 8,565 8,486 48 91
    NEW ORLEANS2003 8,256 8,093 49 88
    NEW ORLEANS2004 8,092 8,122 44 89
    NEW ORLEANS2005 7,252 7,832 18 82
    NEW ORLEANS2006 7,611 7,842 38 82
    NEW ORLEANS2007 7,833 7,962 39 82
    NEW YORK1991 8,713 8,792 39 85
    NEW YORK1992 9,411 9,080 57 94
    NEW YORK1993 9,795 9,293 69 97
    NEW YORK1994 9,695 9,084 69 100
    NEW YORK1995 9,081 8,782 61 93
    NEW YORK1996 7,589 7,410 46 79
    NEW YORK1997 8,265 7,999 60 87
    NEW YORK1998 8,395 8,215 47 92
    NEW YORK1999 6,020 5,929 39 70
    NEW YORK2000 8,906 8,803 59 98
    NEW YORK2001 7,720 7,520 50 87
    NEW YORK2002 7,514 7,839 30 82
    NEW YORK2003 7,860 7,971 37 82
    NEW YORK2004 7,878 8,050 39 86
    NEW YORK2005 7,977 8,177 33 82
    NEW YORK2006 7,842 8,367 23 82
    NEW YORK2007 7,994 8,228 33 82
    ORLANDO1991 8,684 9,010 31 82
    ORLANDO1992 8,330 8,897 21 82
    ORLANDO1993 8,653 8,543 41 82
    ORLANDO1994 8,941 8,638 50 85
    ORLANDO1995 10,790 10,200 67 99
    ORLANDO1996 8,538 8,113 61 82
    ORLANDO1997 7,577 7,658 42 81
    ORLANDO1998 7,387 7,475 41 82
    ORLANDO1999 4,818 4,713 34 54
    ORLANDO2000 8,206 8,150 41 82
    ORLANDO2001 8,403 8,345 44 86
    ORLANDO2002 8,615 8,506 45 86
    ORLANDO2003 8,690 8,731 45 89
    ORLANDO2004 7,711 8,287 21 82
    ORLANDO2005 8,160 8,344 36 82
    ORLANDO2006 7,784 7,872 36 82
    ORLANDO2007 8,123 8,095 40 86
    PHILADELPHIA1991 9,448 9,473 48 90
    PHILADELPHIA1992 8,358 8,462 35 82
    PHILADELPHIA1993 8,556 9,029 26 82
    PHILADELPHIA1994 8,033 8,658 25 82
    PHILADELPHIA1995 7,820 8,236 24 82
    PHILADELPHIA1996 7,450 8,275 16 79
    PHILADELPHIA1997 8,131 8,658 22 81
    PHILADELPHIA1998 7,651 7,847 31 82
    PHILADELPHIA1999 5,197 5,090 31 58
    PHILADELPHIA2000 8,713 8,616 54 92
    PHILADELPHIA2001 9,887 9,538 68 105
    PHILADELPHIA2002 7,906 7,819 45 87
    PHILADELPHIA2003 9,046 8,847 54 94
    PHILADELPHIA2004 7,215 7,419 33 82
    PHILADELPHIA2005 8,582 8,683 44 87
    PHILADELPHIA2006 8,147 8,307 38 82
    PHILADELPHIA2007 7,785 8,033 35 82
    PHOENIX1991 9,731 9,240 56 86
    PHOENIX1992 10,142 9,644 57 90
    PHOENIX1993 11,813 11,257 75 106
    PHOENIX1994 9,923 9,578 61 92
    PHOENIX1995 10,183 9,824 65 92
    PHOENIX1996 7,681 7,679 38 74
    PHOENIX1997 8,623 8,660 39 84
    PHOENIX1998 8,538 8,143 57 86
    PHOENIX1999 5,056 4,974 27 53
    PHOENIX2000 8,897 8,502 57 91
    PHOENIX2001 8,064 7,920 52 86
    PHOENIX2002 7,802 7,856 36 82
    PHOENIX2003 8,345 8,283 46 88
    PHOENIX2004 7,723 8,029 29 82
    PHOENIX2005 10,734 10,087 71 97
    PHOENIX2006 11,030 10,551 64 102
    PHOENIX2007 10,182 9,528 67 93
    PORTLAND1991 11,069 10,336 72 98
    PORTLAND1992 11,444 10,778 70 103
    PORTLAND1993 9,297 9,033 52 86
    PORTLAND1994 9,210 9,015 48 86
    PORTLAND1995 8,752 8,491 44 85
    PORTLAND1996 7,102 7,051 36 72
    PORTLAND1997 7,909 7,573 49 80
    PORTLAND1998 8,133 8,035 47 86
    PORTLAND1999 5,862 5,550 42 63
    PORTLAND2000 9,470 8,870 69 98
    PORTLAND2001 8,091 7,791 50 85
    PORTLAND2002 8,191 7,965 49 85
    PORTLAND2003 8,511 8,290 53 89
    PORTLAND2004 7,439 7,544 41 82
    PORTLAND2005 7,621 7,949 27 82
    PORTLAND2006 7,285 8,060 21 82
    PORTLAND2007 7,717 8,069 32 82
    SACRAMENTO1991 7,928 8,484 25 82
    SACRAMENTO1992 8,549 9,046 29 82
    SACRAMENTO1993 8,847 9,103 25 82
    SACRAMENTO1994 8,291 8,764 28 82
    SACRAMENTO1995 8,060 8,134 39 82
    SACRAMENTO1996 7,031 7,264 31 71
    SACRAMENTO1997 7,238 7,508 30 75
    SACRAMENTO1998 7,641 8,101 27 82
    SACRAMENTO1999 5,462 5,507 29 55
    SACRAMENTO2000 9,089 8,890 46 87
    SACRAMENTO2001 9,123 8,646 58 90
    SACRAMENTO2002 10,193 9,535 71 98
    SACRAMENTO2003 9,635 9,076 66 94
    SACRAMENTO2004 9,575 9,163 62 94
    SACRAMENTO2005 9,016 8,861 51 87
    SACRAMENTO2006 8,690 8,621 46 88
    SACRAMENTO2007 8,303 8,451 33 82
    SAN ANTONIO1991 9,213 8,863 56 86
    SAN ANTONIO1992 8,834 8,589 47 85
    SAN ANTONIO1993 9,659 9,439 54 92
    SAN ANTONIO1994 8,554 8,156 56 86
    SAN ANTONIO1995 10,219 9,659 71 97
    SAN ANTONIO1996 8,473 8,037 56 83
    SAN ANTONIO1997 6,182 6,691 17 67
    SAN ANTONIO1998 8,413 8,057 60 91
    SAN ANTONIO1999 6,143 5,617 52 67
    SAN ANTONIO2000 8,213 7,731 54 86
    SAN ANTONIO2001 9,076 8,445 65 95
    SAN ANTONIO2002 8,843 8,304 62 92
    SAN ANTONIO2003 10,131 9,555 76 106
    SAN ANTONIO2004 8,394 7,770 63 92
    SAN ANTONIO2005 10,117 9,377 75 105
    SAN ANTONIO2006 9,177 8,589 70 95
    SAN ANTONIO2007 9,992 9,222 74 102
    SEATTLE1991 9,262 9,175 43 87
    SEATTLE1992 9,687 9,462 52 91
    SEATTLE1993 10,788 10,181 65 101
    SEATTLE1994 9,161 8,414 65 87
    SEATTLE1995 9,444 8,748 58 86
    SEATTLE1996 8,955 8,324 68 87
    SEATTLE1997 9,438 8,751 63 93
    SEATTLE1998 9,198 8,634 65 92
    SEATTLE1999 4,743 4,797 25 50
    SEATTLE2000 8,591 8,519 47 87
    SEATTLE2001 7,978 7,976 44 82
    SEATTLE2002 8,445 8,248 47 87
    SEATTLE2003 7,555 7,565 40 82
    SEATTLE2004 7,964 8,016 37 82
    SEATTLE2005 9,197 9,028 58 93
    SEATTLE2006 8,411 8,659 35 82
    SEATTLE2007 8,130 8,367 31 82
    TORONTO1996 6,995 7,524 17 72
    TORONTO1997 7,101 7,310 28 74
    TORONTO1998 7,781 8,541 16 82
    TORONTO1999 4,557 4,639 23 50
    TORONTO2000 8,219 8,244 45 85
    TORONTO2001 9,113 8,917 53 94
    TORONTO2002 7,913 7,963 44 87
    TORONTO2003 7,453 7,934 24 82
    TORONTO2004 7,006 7,253 33 82
    TORONTO2005 8,178 8,311 33 82
    TORONTO2006 8,286 8,532 27 82
    TORONTO2007 8,702 8,653 49 88
    UTAH1991 9,473 9,180 58 91
    UTAH1992 10,523 9,993 64 98
    UTAH1993 9,145 8,988 49 87
    UTAH1994 9,872 9,500 61 98
    UTAH1995 9,246 8,609 62 87
    UTAH1996 9,321 8,705 59 92
    UTAH1997 9,466 8,721 72 92
    UTAH1998 10,058 9,480 75 102
    UTAH1999 5,647 5,301 42 61
    UTAH2000 8,797 8,480 59 92
    UTAH2001 8,407 8,043 55 87
    UTAH2002 8,223 8,154 45 86
    UTAH2003 8,227 8,084 48 87
    UTAH2004 7,271 7,371 42 82
    UTAH2005 7,625 7,975 26 82
    UTAH2006 7,573 7,789 41 82
    UTAH2007 9,986 9,736 60 99
    WASHINGTON1991 8,313 8,721 30 82
    WASHINGTON1992 8,395 8,761 25 82
    WASHINGTON1993 8,353 8,930 22 82
    WASHINGTON1994 8,229 8,834 24 82
    WASHINGTON1995 8,242 8,701 21 82
    WASHINGTON1996 6,770 6,781 28 66
    WASHINGTON1997 8,215 8,138 42 83
    WASHINGTON1998 7,969 7,921 42 82
    WASHINGTON1999 4,560 4,672 18 50
    WASHINGTON2000 7,921 8,190 29 82
    WASHINGTON2001 7,645 8,192 19 82
    WASHINGTON2002 7,609 7,724 37 82
    WASHINGTON2003 7,502 7,585 37 82
    WASHINGTON2004 7,527 7,990 25 82
    WASHINGTON2005 9,245 9,297 49 92
    WASHINGTON2006 8,946 8,793 44 88
    WASHINGTON2007 8,922 8,999 41 86
    The goal is to select an exponent, n, for the modified Pythagorean expectation equation
    Expected NBA Wins ≈ Games Played * PointsFor^n/(PointsFor^n + PointsAgainst^n)
    that best "fits" the actual data set. You should note that by dividing through by "Games Played" we then have an estimator of the probability of a team winning a particular game given the expected number of points scored for and against.

    When attempting to match actual frequencies with predicted expectations, one frequently uses what's known as "logistic modeling". Realize that probabilities are necessarily bounded by 0 and 1 and are inherently nonlinear in nature. This can be intuitively understood by considering the difference between a 100% probability event overvalued at 99.01% (US: -10,000) versus a 50% probability event overvalued at 49.50% (US: +102). Both events are overvalued by 1%, but in the former case a growth-maximizing investor should be willing to bet and borrow every penny possibly available to him, while in the latter case such a bettor would invest less than 1% of his bankroll.
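The betting comparison above can be checked numerically. The sketch below assumes that "growth-maximizing" refers to Kelly staking, which is my reading rather than something stated in the post; `kelly_fraction` is a name made up for this example.

```python
def kelly_fraction(p, b):
    """Kelly stake as a fraction of bankroll.

    p: true win probability
    b: fractional payout odds (profit per unit staked), i.e. decimal odds - 1
    """
    return (b * p - (1.0 - p)) / b


# Certain event priced at US -10,000: risk 100 to win 1, so b = 0.01
sure_thing = kelly_fraction(1.0, 0.01)

# Coin flip priced at US +102: risk 100 to win 102, so b = 1.02
coin_flip = kelly_fraction(0.5, 1.02)

print(sure_thing, round(coin_flip, 4))
```

The same 1% mispricing yields a full-bankroll stake in one case and a stake just under 1% in the other, which is the nonlinearity being described.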

    Logistic modeling defines the following function of a probability, p:
    logit(p) = ln(p) - ln(1-p)
    Another way of saying this is that the logit() of a probability is the logarithm of the odds in favor, p/(1-p), which is the negative of the logarithm of the "fair" fractional payout odds (i.e., decimal odds-1) associated with that probability. So in other words, the logit() of an event that occurs exactly half of the time (p=50%) would be 0, the logit() of an event that always occurs (p=100%) would be +∞, and the logit() of an event that never occurs (p=0%) would be -∞.
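A short sketch of the logit and its inverse, including the boundary cases just mentioned. Returning infinities at p=0 and p=1 is a choice made for this example; `math.log(0)` would otherwise raise an error.

```python
import math


def logit(p):
    """ln(p) - ln(1 - p), with the limits at 0 and 1 handled explicitly."""
    if p <= 0.0:
        return -math.inf
    if p >= 1.0:
        return math.inf
    return math.log(p) - math.log(1.0 - p)


def logistic(x):
    """Inverse of logit; written in two branches so large |x| does not overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


print(logit(0.5), logit(1.0), logit(0.0))
print(logistic(logit(0.8)))
```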

    We then proceed by determining the exponent that minimizes the sum of the squared deviations between the logit of the actual win percentage and the logit of the Pythagorean expectation. This is a straightforward problem of convex optimization well within the capabilities of Microsoft Excel Solver. If you look at Sheet1 of the attached spreadsheet you'll see that the exponent value that provides the best fit for the data set is roughly 14.02. This is frighteningly close to the value of 14 attributed to Dean Oliver by Wikipedia.
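The same fit can be sketched outside Excel. Because the logit of the Pythagorean expectation simplifies to n·ln(PointsFor/PointsAgainst), the least-squares exponent has a closed form (a regression through the origin), so no solver is needed. The six rows below are copied from the table above purely to keep the example short; an exponent fitted on six rows should not be expected to match the full-data value of roughly 14.02.

```python
import math

# (team-season, points for, points against, wins, games), copied from the table
SAMPLE_ROWS = [
    ("ATLANTA1991", 9489, 9466, 45, 87),
    ("BOSTON1997", 7941, 8567, 13, 79),
    ("CHICAGO1996", 9605, 8457, 82, 92),
    ("DALLAS1993", 8141, 9387, 11, 82),
    ("SAN ANTONIO2005", 10117, 9377, 75, 105),
    ("UTAH1999", 5647, 5301, 42, 61),
]


def fit_exponent(rows):
    """Least-squares n for logit(win%) = n * ln(PF/PA), with no intercept."""
    sum_xy = 0.0
    sum_xx = 0.0
    for _name, pf, pa, wins, games in rows:
        x = math.log(pf / pa)                 # logit of the Pythagorean expectation, per unit n
        y = math.log(wins / (games - wins))   # logit of the actual win percentage
        sum_xy += x * y
        sum_xx += x * x
    return sum_xy / sum_xx


def squared_logit_error(n, rows):
    """Sum of squared deviations between actual and predicted logits."""
    total = 0.0
    for _name, pf, pa, wins, games in rows:
        predicted = n * math.log(pf / pa)
        actual = math.log(wins / (games - wins))
        total += (actual - predicted) ** 2
    return total


n_hat = fit_exponent(SAMPLE_ROWS)
print(f"exponent fitted on {len(SAMPLE_ROWS)} rows: {n_hat:.2f}")
```

Any team-season with zero wins or zero losses would need to be dropped or adjusted first, since its logit is infinite.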

    Determination of relative goodness-of-fit for this and other sports is left as an exercise for the interested reader (translation: don't expect me to chime in, but please see this paper if particularly motivated).
    So we have:

    NBA Win Probability ≈ PointsFor^14.02/(PointsFor^14.02 + PointsAgainst^14.02)

    Because the OP wasn't so kind as to provide values for the variables w, x, y, and z I'll arbitrarily assign them as follows:

    w = 74
    x = 73
    y = 68
    z = 72

    Using the Pythagorean expectation formula above we come up with predicted win probabilities as follows (see Sheet2 of the attached spreadsheet):

    Team A: 68.44%
    Team A opp: 54.98%
    Team B: 59.81%
    Team B opp: 64.31%
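These four figures can be reproduced in a few lines. The sketch uses the rounded exponent 14.02, so the last digit can differ slightly from the spreadsheet, which presumably carries more decimals; `win_prob` is a name chosen for this example.

```python
def win_prob(points_for, points_against, n=14.02):
    """Pythagorean win probability, PF^n / (PF^n + PA^n), written via the ratio PA/PF."""
    return 1.0 / (1.0 + (points_against / points_for) ** n)


team_a = win_prob(78.2, 74.0)           # A scores 78.2, allows w = 74
team_a_opp = win_prob(73.0, 71.96667)   # A's opponents score x = 73, allow 71.96667
team_b = win_prob(68.0, 66.1)           # B scores y = 68, allows 66.1
team_b_opp = win_prob(75.0875, 72.0)    # B's opponents score 75.0875, allow z = 72

for label, p in [("Team A", team_a), ("Team A opp", team_a_opp),
                 ("Team B", team_b), ("Team B opp", team_b_opp)]:
    print(f"{label}: {p:.2%}")
```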
    If we assume that Team A & B's opponents have themselves established their win records against "average" opposition we can then impute Team A and B's respective performances against average opposition. See the following post I had made at another site for a brief discussion of the general methodology:
    Quote Originally Posted by Ganchrow
    Quote Originally Posted by trytowin
    I see on nba.com in the standings section that Boston has a home win percentage of 0.87879 (29-4), and that Utah has a road win percentage of 0.42857 (15-20). The question is, what is the expected probability based on these two numbers only that Boston will win tonight's game?

    a) 0.5 - 0.42857 = 0.07143 + 0.87879 ==> 0.95022
    b) (0.42857 + 0.87879) / 2 ==> 0.65368
    c) (0.87879 - 0.42857) * 2 ==> 0.90044
    d) none of the above

    I'm not sure of the correct answer, so please explain. Thx, TTW
    The Sabermetrics guys have a methodology for handling this called "log5", but it assumes a league average home team win probability of 50%.

    Just thinking about it logically, here's how I'd do it:
    Let A = Team A home win % (taken as an unbiased estimator of Boston's "true" home win probability versus an "average" opponent) = 29/33
    Let B = Team B road win% (taken as an unbiased estimator of Utah's "true" road win probability versus an "average" opponent) = 15/35
    Let H* = expected home win probability between two equally matched teams = 61% (which is roughly the total NBA home win frequency over the past 25 years)
    Let P = expected probability of A beating B playing at A's Home
    Define the logit function of a probability p as the log of the odds in favor, p/(1-p) (the reciprocal of the "fair" fractional payout odds), so that:
    logit(p) = ln(p) - ln(1-p)
    inverted:
    p = exp(logit(p))/(1+exp(logit(p)))
    This gives us:
    logit(A) = ln(29) - ln(4) ≈ 1.981001469
    logit(B) = ln(15) - ln(20) ≈ -0.287682072
    logit(H*) = ln(61%) - ln(39%) ≈ 0.447312218
    So from Bayes' Theorem:
    logit(P) = logit(A) - logit(B) - logit(H*)
    logit(P) ≈ 1.981001469 - (-0.287682072) - 0.447312218 ≈ 1.821371323
    Solving for P we have:
    exp(logit(P)) ≈ exp(1.821371323) ≈ 6.180327869
    P = 6.180327869 / (1+6.180327869) ≈ 86.073%
    So based on the above data, Boston's win probability over Utah was about 86.073%, implying fair US payout odds of about -618.
    (You can also see a preview of the following not-yet-ready-for-prime-time version of a calculator that performs these calculations here. Final version to be released soon.)
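The Boston/Utah calculation can be sketched as follows. This is not the calculator mentioned above, just the quoted formula typed out; `home_win_prob` and its parameter names are made up for this example.

```python
import math


def logit(p):
    return math.log(p) - math.log(1.0 - p)


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def home_win_prob(home_team_home_pct, road_team_road_pct, league_home_pct):
    """logit(P) = logit(A) - logit(B) - logit(H*), then invert."""
    return logistic(
        logit(home_team_home_pct)
        - logit(road_team_road_pct)
        - logit(league_home_pct)
    )


# Boston 29-4 at home, Utah 15-20 on the road, league home win rate taken as 61%
p_boston = home_win_prob(29 / 33, 15 / 35, 0.61)
print(f"{p_boston:.3%}")
```

Passing 0.5 for all three arguments returns 0.5, a quick sanity check on the signs.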

    So we have:
    logit(A win prob vs. Avg. Opp) = logit(68.44%) + logit(54.98%) = 0.974158898
    logit(B win prob vs. Avg. Opp) = logit(59.81%) + logit(64.31%) = 0.98630472

    This then yields:
    logit(A win prob vs. B) = logit(A win prob vs. Avg. Opp) - logit(B win prob vs. Avg. Opp) = 0.974158898 - 0.98630472 = -0.012145822

    Inverting the logit function yields what's known as the logistic function:
    p = 1 / (1+exp(-logit(p)))

    giving us a final value for the probability of A defeating B of:
    p = 1 / (1+exp(0.012145822)) ≈ 49.70%
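The whole chain from the four win probabilities to the final figure can be sketched in a few lines. Each team's logit against average opposition is formed here by adding its opponents' logit to its own, which is what reproduces the 0.974 and 0.986 totals; the inputs are the rounded percentages listed earlier, so the intermediate values agree only to about three decimals.

```python
import math


def logit(p):
    return math.log(p) - math.log(1.0 - p)


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit_vs_average(team_p, opponents_p):
    """Strength against average opposition: own logit plus the opponents' logit."""
    return logit(team_p) + logit(opponents_p)


a_vs_avg = logit_vs_average(0.6844, 0.5498)   # Team A and Team A's opponents
b_vs_avg = logit_vs_average(0.5981, 0.6431)   # Team B and Team B's opponents
p_a_beats_b = logistic(a_vs_avg - b_vs_avg)

print(round(a_vs_avg, 3), round(b_vs_avg, 3), f"{p_a_beats_b:.2%}")
```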

    (If using the alpha-version calculator linked above, you'd enter the team in question's win probability into the "Home Team" box, 1-opponent's win probability into the "Away Team" box, and 50% into the "League Average" box. This will automatically perform the logit calculations. Comments are welcome.)
    Attached Files

  4. #4
    Data
    Join Date: 11-27-07
    Posts: 2,236

    Quote Originally Posted by Ganchrow View Post
    If, however, you'd settle for roughly estimating Team A's win probability vs. Team B, then there is a straightforward (if slightly involved) methodology that may be employed to this end. Still necessary, however, is the assumption that the opponents of Team A and B have themselves played against directly relatable opposition (consider that there'd be no way one would be able to draw meaningful conclusions about the Knicks' likely performance against a Division III team based solely on the two teams' results within their respective leagues).
    I think that in basketball one can get at least better than meaningless assumptions based on league averages. I would also be interested to hear from European players chiming in on Eurocup linemaking.

  5. #5
    Ganchrow
    Nolite te bastardes carborundorum.
    Join Date: 08-28-05
    Posts: 5,011
    Betpoints: 1088

    Quote Originally Posted by Data
    I think that in basketball one can get at least better than meaningless assumptions based on league averages.
    Please explain how this is possible within the context of the provided example (determining a fair market line between the Knicks and a Division III team based solely on the two teams' results within their respective leagues).

    Clearly there would first need to be some predefined means of making inter-league comparisons. League averages would by themselves be wholly insufficient.

  6. #6
    Data
    Join Date: 11-27-07
    Posts: 2,236

    Quote:
    Clearly there would first need to be some predefined means of making inter-league comparisons.
    My first thought is that in sports like basketball or baseball the game outcome depends, to a great extent, on the sum of some "units" that reflect individual players' skills, while in sports like football or soccer team play has a much larger effect. Thus, the former sports provide the means for inter-league comparisons just by adding those "units".

    Quote:
    League averages would by themselves be wholly insufficient.
    True. Using league averages, which were not mentioned in this thread, is insufficient yet necessary for making this kind of prediction, whether intra- or inter-league.

  7. #7
    Ganchrow
    Nolite te bastardes carborundorum.
    Join Date: 08-28-05
    Posts: 5,011
    Betpoints: 1088

    Quote Originally Posted by Data
    My first thought is that in sports like basketball or baseball the game outcome depends, to a great extent, on the sum of some "units" that reflect individual players' skills, while in sports like football or soccer team play has a much larger effect. Thus, the former sports provide the means for inter-league comparisons just by adding those "units".

    True. Using league averages, which were not mentioned in this thread, is insufficient yet necessary for making this kind of prediction, whether intra- or inter-league.
    The whole point of the exercise outlined in this post is to use both Team A's and Team B's average scoring (for and against), as well as that of their opponents (established versus league averages), to determine a fair money line.

  8. #8
    Data
    Join Date: 11-27-07
    Posts: 2,236

    It seems that you are talking about estimating win%, which can essentially be done using a variety of methods, including Pythagorean, that use offensive/defensive efficiencies (ratings, scores, etc.). The OP and I are talking about estimating those efficiencies/ratings/whatever.
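
    To make the distinction concrete, here is a rough Python sketch of the Pythagorean step, i.e. the part that turns already-estimated scoring figures into a win%. The exponent of 14 and the sample scores are placeholder assumptions, not values from this thread; the exponent would need to be fitted to the league in question.

```python
def pythagorean_win_pct(points_for, points_against, exponent=14.0):
    """Pythagorean expectation: PF^x / (PF^x + PA^x).

    `exponent` is an assumed placeholder; it should be fitted to real data.
    """
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)


# Hypothetical inputs: a team expected to score 78 and allow 72 per game.
est_win_pct = pythagorean_win_pct(78.0, 72.0)
print(f"estimated win%: {est_win_pct:.1%}")
```

    The formula only consumes the ratings. How to adjust them for opponent strength is the separate question the OP is asking.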

  9. #9
    butters
    Join Date: 07-10-07
    Posts: 5

    Perhaps this is overly simplistic, but would it be possible to estimate these values using a simple linear regression? Each previous game could be represented as:

    Team_A_Off_Rating - Team_B_Def_Rating = Team A's Points For that Game (or points per 100 possessions for that game or something similar)

    You could then run a regression to determine the values of each team's offensive and defensive ratings, adjusted for opponent quality. (Note that this would not produce actual points/game or efficiency values, but I think it would produce relative values that could be used to predict offensive efficiency for future matchups.) It seems to me that this would automatically account for points scored/allowed in previous games by both teams, as well as the points scored/allowed by their opponents, and their opponents' opponents, and so on...

    Obviously, this wouldn't work for Ganchrow's example of the Knicks playing a DIII team, but I think that it could work for two teams that could be 'indirectly' linked through previous games.
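
    A small Python sketch of the idea, under loud assumptions: the four teams and their scores are made up, and instead of calling a regression package it solves the same least-squares problem with alternating averaging updates (a simple coordinate descent), so it runs with no dependencies. Only the difference Off - Def is meaningful; the individual ratings are determined only up to a common shift.

```python
# Made-up results, one row per team per game: (offense, defense, points scored).
games = [
    ("A", "B", 72), ("B", "A", 70),
    ("B", "C", 73), ("C", "B", 62),
    ("C", "D", 64), ("D", "C", 70),
    ("A", "C", 78), ("C", "A", 65),
]


def fit_ratings(games, iterations=500):
    """Fit points ~= off[offense] - dfn[defense] by alternating least squares."""
    teams = sorted({t for o, d, _ in games for t in (o, d)})
    off = {t: 0.0 for t in teams}
    dfn = {t: 0.0 for t in teams}
    for _ in range(iterations):
        # Best offensive rating for each team, holding defensive ratings fixed.
        for t in teams:
            vals = [pts + dfn[d] for o, d, pts in games if o == t]
            if vals:
                off[t] = sum(vals) / len(vals)
        # Best defensive rating for each team, holding offensive ratings fixed.
        for t in teams:
            vals = [off[o] - pts for o, d, pts in games if d == t]
            if vals:
                dfn[t] = sum(vals) / len(vals)
    return off, dfn


off, dfn = fit_ratings(games)

# A and D never met, but they are linked indirectly through B and C.
pred_a_vs_d = off["A"] - dfn["D"]
pred_d_vs_a = off["D"] - dfn["A"]
print(f"A scores about {pred_a_vs_d:.1f}, D scores about {pred_d_vs_a:.1f}")
```

    With real, noisy results the fit would not be exact, and a schedule with no indirect links between two teams (the Knicks vs. DIII case) leaves their matchup undetermined.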
