A guy hits 54% of 2000 plays .......

**Justin7** · 01-24-09, 12:22 AM

Are you assuming he will hit 54% going forward? We need more info.

**Arilou** · 01-24-09, 09:02 AM

I did this quickly since the details aren't too important. I believe the minimum probability is roughly 26.8%. If you assume that on average the player is less than 54% going forward, which is a reasonable assumption since it's more likely he is a 50% capper than a 58% capper and thus you can't actually be centered normally around 54%, it becomes somewhat more. In practice, assuming he keeps on doing exactly what he was doing and market conditions don't change I'd probably cap it at about 2:1.

**Ganchrow** · 01-24-09, 10:06 AM

Solving this is a straightforward exercise in what's known as Bayesian inference, a topic into which I've frequently delved on this forum.

The idea is that we use observed information (54% over the last 2,000 plays) to update our prior beliefs about this bettor (prior beliefs unspecified in this problem). This yields what's known as a posterior distribution of our handicapper's "true pick probability".

Directly from Bayes' formula we have that the equation for our posterior distribution would be as follows:

Pr(p*=X | 55% over 2,000) = Pr(p*=X) * Pr(55% over 2,000 | p*=X) / Pr(55% over 2,000)

Armed with this posterior distribution, we then take the summation across all possible values of true pick probability (i.e., from 0 to 1; because our variable is continuous we'd in fact be dealing with an integral) of the likelihood that a given probability is our bettor's "true" probability, times the likelihood of the bettor hitting 55% or better of his picks given that probability.

So for example, we'd take the probability that the bettor were a 50% picker, times the probability that a 50% picker would go 55% over 2,000 picks, plus the probability that the bettor were a 50.1% picker, times the probability that a 50.1% picker would go 55% over 2,000 picks, plus the probability that the bettor were a 50.2% picker, times the probability that a 50.2% picker would go 55% over 2,000 picks, etc. (And remember we'd be doing this over all possible values of p, from 0 to 1.)

This summation (integration) would then yield the answer to your question.

So let's just give a simple example using an extremely oversimplified prior distribution.

Let's say that in the sample from which the handicapper was drawn the prior likelihoods, q, for each of the following (discrete) true pick probabilities, p, are given in the below chart:

CHART 1:

Code:

p	q
45%	0.0733%
46%	0.2933%
47%	1.1730%
48%	4.6921%
49%	18.7683%
50%	50.0000%
51%	18.7500%
52%	4.6875%
53%	1.1719%
54%	0.2930%
55%	0.0732%
56%	0.0183%
57%	0.0046%
58%	0.0011%
59%	0.0003%
60%	0.0001%

(Please note that the above numbers were selected without much thought, although slightly tempered by my prior knowledge of handicapping. They sit quite far from gospel, and readers should feel free to modify these numbers in any way they see fit going forward.)

This means that a handicapper selected at random has a 75% probability of being a 50% picker or better, and a 0.2930%+0.0732%+0.0183%+0.0046%+0.0011%+0.0003%+0.0001% = 0.3906% probability of being a 54% picker or better.
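
As a quick sanity check on Chart 1, the two tail sums quoted above can be reproduced in a few lines. This is only a sketch: Python is an assumption here (the post itself works in spreadsheet functions), and the names `prior_pct` and `tail_mass` are made up for illustration.

```python
# Chart 1 prior, in percent, keyed by true pick probability in percent.
prior_pct = {
    45: 0.0733, 46: 0.2933, 47: 1.1730, 48: 4.6921, 49: 18.7683,
    50: 50.0000, 51: 18.7500, 52: 4.6875, 53: 1.1719, 54: 0.2930,
    55: 0.0732, 56: 0.0183, 57: 0.0046, 58: 0.0011, 59: 0.0003,
    60: 0.0001,
}


def tail_mass(threshold):
    """Prior probability (in percent) that the true pick rate is >= threshold."""
    return sum(q for p, q in prior_pct.items() if p >= threshold)


total = sum(prior_pct.values())   # should be close to 100
at_least_50 = tail_mass(50)       # "50% picker or better"
at_least_54 = tail_mass(54)       # "54% picker or better"
print(f"total={total:.4f}%  >=50%: {at_least_50:.4f}%  >=54%: {at_least_54:.4f}%")
```

Changing any entry of `prior_pct` is the easiest way to try a different prior, as long as the entries still add up to 100%.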

So recalling that Pr(p*=X | 55% over 2,000) = Pr(p*=X) * Pr(55% over 2,000 | p*=X) / Pr(55% over 2,000)

Let's first derive the denominator, as its value isn't dependent on any specific realized values.

Pr(55% over 2,000) =
SUM(i=45% to 60%) { Pr(p*=i) * Pr(55% over 2,000 | p*=i) }
= 0.0733%*BINOMDIST(1100,2000,45%,FALSE)
+ 0.2933%*BINOMDIST(1100,2000,46%,FALSE)
+ ...
+ 50.0000%*BINOMDIST(1100,2000,50%,FALSE)
+ 18.7500%*BINOMDIST(1100,2000,51%,FALSE)
+ 4.6875%*BINOMDIST(1100,2000,52%,FALSE)
+ 1.1719%*BINOMDIST(1100,2000,53%,FALSE)
+ ...
≈ 1.2118*10^-4
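
For anyone who would rather check the denominator outside a spreadsheet, here is a minimal Python sketch of the same sum. It assumes the Chart 1 prior and exactly 1,100 wins in 2,000 picks; `binom_pmf` and `marginal_likelihood` are illustrative names, and the pmf is evaluated in log space because the raw powers would underflow at this sample size.

```python
from math import exp, lgamma, log

# Chart 1 prior (percent) for true pick rates 45%..60%, converted to fractions.
Q_PCT = [0.0733, 0.2933, 1.1730, 4.6921, 18.7683, 50.0000, 18.7500, 4.6875,
         1.1719, 0.2930, 0.0732, 0.0183, 0.0046, 0.0011, 0.0003, 0.0001]
PRIOR = {r / 100: q / 100 for r, q in zip(range(45, 61), Q_PCT)}


def binom_pmf(k, n, p):
    """Binomial pmf P(X = k), evaluated in log space to avoid underflow."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))


def marginal_likelihood(k, n, prior):
    """Sum over candidate rates of prior weight times likelihood of the record."""
    return sum(q * binom_pmf(k, n, p) for p, q in prior.items())


evidence = marginal_likelihood(1100, 2000, PRIOR)
print(f"Pr(55% over 2,000) ~ {evidence:.4e}")
```

The result should land close to the 1.2118*10^-4 figure above; small differences come from the prior being rounded to four decimals in the chart.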

Similarly:

CHART 2:

Code:

X      Pr(p*=X)    Pr(55% over 2,000 | p*=X)
45%     0.0733%    0.0000
46%     0.2933%    0.0000
47%     1.1730%    1.3160*10^-13
48%     4.6921%    5.3914*10^-11
49%    18.7683%    9.8215*10^-9
50%    50.0000%    8.0046*10^-7
51%    18.7500%    2.9309*10^-5
52%     4.6875%    4.8320*10^-4
53%     1.1719%    3.5881*10^-3
54%     0.2930%    1.1982*10^-2
55%     0.0732%    1.7929*10^-2
56%     0.0183%    1.1956*10^-2
57%     0.0046%    3.5260*10^-3
58%     0.0011%    4.5550*10^-4
59%     0.0003%    2.5469*10^-5
60%     0.0001%    6.0771*10^-7

So given that Pr(p*=X | 55% over 2,000) = Pr(p*=X) * Pr(55% over 2,000 | p*=X) / Pr(55% over 2,000) we can then determine our posterior distribution for true pick rate, p*:

CHART 3:

Code:

X      Pr(p*=X | 55% over 2,000)
45%    0.0000%
46%    0.0000%
47%    1.2738*10^-11
48%    2.0875*10^-8
49%    0.0015%
50%    0.3303%
51%    4.5347%
52%    18.6903%
53%    34.6972%
54%    28.9661%
55%    10.8359%
56%    1.8064%
57%    0.1332%
58%    0.0043%
59%    6.0130*10^-7
60%    3.5868*10^-9

(The entries shown in scientific notation are raw probabilities rather than percentages.)

(This, btw, would imply an expected true pick probability of 53.2786% for this bettor.)
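
The posterior in Chart 3 and its expected value can be recomputed with a short Python sketch. It assumes the Chart 1 prior and a record of 1,100 wins in 2,000 picks; the helper names are invented for this example and are not from the original post.

```python
from math import exp, lgamma, log

RATES = [r / 100 for r in range(45, 61)]
Q_PCT = [0.0733, 0.2933, 1.1730, 4.6921, 18.7683, 50.0000, 18.7500, 4.6875,
         1.1719, 0.2930, 0.0732, 0.0183, 0.0046, 0.0011, 0.0003, 0.0001]
PRIOR = [q / 100 for q in Q_PCT]


def binom_pmf(k, n, p):
    """Binomial pmf P(X = k), evaluated in log space to avoid underflow."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))


def posterior(k, n, rates, prior):
    """Bayes' rule on a discrete grid: prior times likelihood, renormalized."""
    weights = [q * binom_pmf(k, n, p) for p, q in zip(rates, prior)]
    evidence = sum(weights)
    return [w / evidence for w in weights]


post = posterior(1100, 2000, RATES, PRIOR)
post_mean = sum(p * w for p, w in zip(RATES, post))

for p, w in zip(RATES, post):
    print(f"{p:.0%}  {w:.4%}")
print(f"posterior mean = {post_mean:.4%}")
```

Because the chart's prior is rounded, the printed values may differ from Chart 3 in the last decimal place or so.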

So to solve the problem we now need to calculate the probability of going 52% or worse over one's next 200 picks given each possible pick probability:

CHART 4:

Code:

X	Pr(p*=X |55%)	Pr(<=52% over 200 | p*=X)
45%	 0.0000%	98.0112%
46%	 0.0000%	96.1719%
47%	1.2738E-11	93.1445%
48%	2.0875E-08	88.5482%
49%	 0.0015%	82.1066%
50%	 0.3303%	73.7689%
51%	 4.5347%	63.7986%
52%	18.6903%	52.7822%
53%	34.6972%	41.5357%
54%	28.9661%	30.9299%
55%	10.8359%	21.6950%
56%	 1.8064%	14.2748%
57%	 0.1332%	8.7776%
58%	 0.0043%	5.0265%
59%	6.0130E-07	2.6718%
60%	3.5868E-09	1.3140%

So taking the dot product (i.e., the sum of the products of each paired item) of the 2nd and 3rd vectors in chart 4 (and assuming I've not made any silly mistakes) yields a probability of roughly 38.9947%, corresponding to US-style odds of about +156.4.

And for those of you who might comment that the achieved results were highly dependent on our estimates of our prior distribution, all I can really say is, yes, you're right. This is why coming up with prior estimates (i.e., a baseline from which to interpret new information) is so incredibly important.
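
The last step can be sketched in Python as follows. The posterior weights are typed in from Chart 3 (as fractions), the third column of Chart 4 is recomputed as a cumulative binomial probability of at most 104 wins in 200, and `us_odds` is an illustrative helper that converts a probability into a fair, vig-free US-style line.

```python
from math import comb

RATES = [r / 100 for r in range(45, 61)]
# Posterior from Chart 3, entered as fractions rather than percentages.
POSTERIOR = [0.0, 0.0, 1.2738e-11, 2.0875e-08, 0.000015, 0.003303,
             0.045347, 0.186903, 0.346972, 0.289661, 0.108359, 0.018064,
             0.001332, 0.000043, 6.0130e-07, 3.5868e-09]


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def us_odds(prob):
    """Fair US-style odds for an event with probability prob (no vig)."""
    if prob < 0.5:
        return 100 * (1 - prob) / prob    # underdog: positive line
    return -100 * prob / (1 - prob)       # favorite: negative line


at_most_104 = [binom_cdf(104, 200, p) for p in RATES]   # 52% of 200 = 104 wins
predictive = sum(w * c for w, c in zip(POSTERIOR, at_most_104))
print(f"{predictive:.4%}  ->  {us_odds(predictive):+.1f}")
```

The printed figures should come out near the 38.9947% and +156.4 quoted above, up to rounding in the chart entries.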

**raiders72002** · 01-24-09, 10:27 AM

Thank you, much appreciated.

**raiders72002** · 01-24-09, 10:32 AM

Originally posted by Justin7

Are you assuming he will hit 54% going forward? We need more info.

no, all I had was the data posted.

**raiders72002** · 01-24-09, 03:29 PM

Pancho Sanza

Holy shit

He first solved for what the true pick rate would be, 53.2786, I just used 54 %

Weird though that he used 55 % over 2000 to crunch his numbers when you indicated it was 54 %. That likely altered the true pick rate calculation.

Also, at the end, he computes the probability of < 52 % over 200 plays, you asked for 500 plays.

Using his number of 53.2786 and 200 plays, the calculator spits out 38.48 %, very close to his number of 38.9947 %

**Ganchrow** · 01-24-09, 03:56 PM

Originally posted by Pancho Sanza

Holy shit

He first solved for what the true pick rate would be, 53.2786, I just used 54 %

Weird though that he used 55 % over 2000 to crunch his numbers when you indicated it was 54 %. That likely altered the true pick rate calculation.

Also, at the end, he computes the probability of < 52 % over 200 plays, you asked for 500 plays.

Using his number of 53.2786 and 200 plays, the calculator spits out 38.48 %, very close to his number of 38.9947 %

Yeah as Pancho indicates, I had misread your initial post to read 55% over his last 2,000 picks to solve for 52% over his next 200.

Using 54% and 500 plays, as well as the prior distribution presented above, yields a posterior probability of 46.4923%.
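
For readers who want to rerun the whole calculation with different inputs, here is a self-contained Python sketch of the pipeline. It assumes the Chart 1 prior, treats "54% over 2,000" as exactly 1,080 wins, and reads "52% over the next 500" as at most 260 wins (mirroring the "<=52%" column of Chart 4); the thread does not spell out that last convention, so treat it as an assumption. The function name `predictive_at_most` is made up for this example.

```python
from math import exp, lgamma, log

RATES = [r / 100 for r in range(45, 61)]
Q_PCT = [0.0733, 0.2933, 1.1730, 4.6921, 18.7683, 50.0000, 18.7500, 4.6875,
         1.1719, 0.2930, 0.0732, 0.0183, 0.0046, 0.0011, 0.0003, 0.0001]
PRIOR = [q / 100 for q in Q_PCT]


def binom_pmf(k, n, p):
    """Binomial pmf P(X = k), evaluated in log space to avoid underflow."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))


def predictive_at_most(past_wins, past_n, max_wins, future_n):
    """Posterior-weighted probability of at most max_wins in future_n picks."""
    weights = [q * binom_pmf(past_wins, past_n, p) for p, q in zip(RATES, PRIOR)]
    evidence = sum(weights)
    total = 0.0
    for p, w in zip(RATES, weights):
        cdf = sum(binom_pmf(i, future_n, p) for i in range(max_wins + 1))
        total += (w / evidence) * cdf
    return total


misread = predictive_at_most(1100, 2000, 104, 200)     # 55% record, next 200
corrected = predictive_at_most(1080, 2000, 260, 500)   # 54% record, next 500
print(f"misread inputs:   {misread:.4%}")
print(f"corrected inputs: {corrected:.4%}")
```

Under those assumptions the two results should fall close to the 38.9947% and 46.4923% figures given in this thread.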

**Ganchrow** · 01-24-09, 04:08 PM

Originally posted by Pancho Sanza

He first solved for what the true pick rate would be, 53.2786, I just used 54 %

With due respect to Pancho, this is really not an accurate description of the methodology I used.

53.2786% does not represent the true pick probability, but rather the expected value of the posterior distribution of true pick probability, and it is, in general, insufficient information to solve a problem of this sort (although for very well-behaved distributions the results may be close).

It's a subtle difference but important for anyone looking to successfully utilize Bayesian inference.

The true pick probability is an unknown population parameter, while the posterior distribution of true pick probability is a distribution formulated from an imputed prior distribution along with additional evidence (given the data I had used, 55% over the last 2,000 picks).
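
To make the distinction concrete, the sketch below compares the two approaches on the same numbers: plugging the posterior mean into a single binomial, versus averaging the binomial over the whole posterior. It assumes the Chart 3 posterior (entered as fractions) and the "at most 104 wins in 200" event from Chart 4; the variable names are illustrative only.

```python
from math import comb

RATES = [r / 100 for r in range(45, 61)]
# Posterior from Chart 3, entered as fractions rather than percentages.
POSTERIOR = [0.0, 0.0, 1.2738e-11, 2.0875e-08, 0.000015, 0.003303,
             0.045347, 0.186903, 0.346972, 0.289661, 0.108359, 0.018064,
             0.001332, 0.000043, 6.0130e-07, 3.5868e-09]


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


# Shortcut: collapse the posterior to its mean and use one binomial.
post_mean = sum(p * w for p, w in zip(RATES, POSTERIOR))
plug_in = binom_cdf(104, 200, post_mean)

# Full calculation: average the binomial over every candidate pick rate.
mixture = sum(w * binom_cdf(104, 200, p) for p, w in zip(RATES, POSTERIOR))

print(f"posterior mean     = {post_mean:.4%}")
print(f"plug-in estimate   = {plug_in:.4%}")
print(f"posterior-weighted = {mixture:.4%}")
```

With this posterior the two answers sit about half a percentage point apart. The gap can be much larger when the posterior is wide or lopsided.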

**accuscoresucks** · 01-24-09, 04:56 PM

my brain is whacked

**Arilou** · 01-26-09, 09:20 AM

Ganchrow's chart 1 is the key question here. If that is correct, then the rest follows, so the real debate is what that chart should look like! What we have here is essentially a symmetrical chart centered around 50%, with about 1.5% of people hitting 53% (and another 1.5% doing so badly they could get there by fading themselves). I think it's not a crazy first guess if you're drawing from the pool of all players and not selecting for who is tracking or posting their plays, but I'd have to make some adjustment to account for the fact that centering on 50% is too high for such a pool and another for the fact that extreme winners are more likely relative to extreme losers than the mean of the curve would suggest.

**Ganchrow** · 01-26-09, 09:54 AM

Originally posted by Arilou

Ganchrow's chart 1 is the key question here. If that is correct, then the rest follows, so the real debate is what that chart should look like! What we have here is essentially a symmetrical chart centered around 50%, with about 1.5% of people hitting 53% (and another 1.5% doing so badly they could get there by fading themselves). I think it's not a crazy first guess if you're drawing from the pool of all players and not selecting for who is tracking or posting their plays, but I'd have to make some adjustment to account for the fact that centering on 50% is too high for such a pool and another for the fact that extreme winners are more likely relative to extreme losers than the mean of the curve would suggest.

You are indeed correct. The trick is determining a reasonable prior distribution, which was in fact the purpose of my earlier post http://forum.sbrforum.com/handicappe...ick-rates.html.

I will point out, however, that, for what it's worth, regardless of the imputed prior (just so long as it's continuously defined with finite variance), the posterior will asymptotically approach the true pick rate distribution. What this means is that as long as the selected prior has a reasonable mean, standard deviation, and shape, then with a decent sample size you'll find "decent" approximations even if not positively accurate answers.
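
A rough numerical illustration of that point, under several simplifying assumptions: the same discrete 45% to 60% grid as Chart 1 rather than a continuous prior, a hypothetical flat prior as the alternative, and a bettor who hits exactly 54% at every sample size. The sketch tracks how far apart the two posterior means are as the sample grows.

```python
from math import exp, log

RATES = [r / 100 for r in range(45, 61)]
Q_PCT = [0.0733, 0.2933, 1.1730, 4.6921, 18.7683, 50.0000, 18.7500, 4.6875,
         1.1719, 0.2930, 0.0732, 0.0183, 0.0046, 0.0011, 0.0003, 0.0001]
CHART1 = [q / 100 for q in Q_PCT]
FLAT = [1 / len(RATES)] * len(RATES)   # hypothetical alternative prior


def posterior_mean(wins, n, prior):
    """Posterior mean on the grid, using log weights for numerical stability."""
    # The binomial coefficient is identical for every rate, so it cancels out.
    log_w = [log(q) + wins * log(p) + (n - wins) * log(1 - p)
             for p, q in zip(RATES, prior)]
    top = max(log_w)
    w = [exp(v - top) for v in log_w]
    return sum(p * x for p, x in zip(RATES, w)) / sum(w)


gaps = []
for n in (200, 2000, 20000, 200000):
    wins = n * 54 // 100   # exactly 54% winners
    gap = abs(posterior_mean(wins, n, FLAT) - posterior_mean(wins, n, CHART1))
    gaps.append(gap)
    print(f"n={n:>6}  gap between posterior means = {gap:.6f}")
```

The gap shrinks as n grows, which is the practical meaning of the claim: with enough picks, the evidence dominates any reasonable prior that gives the true rate some weight.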

**Scorpion** · 01-31-09, 02:38 PM

Originally posted by accuscoresucks

my brain is whacked

That's what happens to me every time I read Ganch's posts
Put him on your ignore list!

))