Statistics assistance needed

Ganchrow · 05-25-08, 09:57 AM

Originally posted by Bullajami

I am trying to figure the statistical relevance and reliability of a bet selection methodology, but I need some help.

All of the bets are on moneylines.

60 bets were made using 1% as the betting unit, and 33 bets were made using 2% as the betting unit. After 93 bets I am up 20%.

What, if any, conclusions can be made at this point about this strategy?

Thanks in advance for your help.

At what odds were each of the 93 bets placed?

When you refer to 1% as the "betting unit" is that a "to-risk" amount or a "to-win" amount?

Are you compounding your bets so that had you started with 100 units and then found yourself up 10 units, would 1% then imply a wager of 1 unit or of 1.1 units?

Bullajami · 05-25-08, 11:28 AM

Originally posted by Ganchrow

At what odds were each of the 93 bets placed?

When you refer to 1% as the "betting unit" is that a "to-risk" amount or a "to-win" amount?

Are you compounding your bets so that had you started with 100 units and then found yourself up 10 units, would 1% then imply a wager of 1 unit or of 1.1 units?

From -105 to +1000. Mostly around +130. Do you really need all 93 odds?

Betting unit is risked amount.

Yes, betting unit is based on compounding results.

Ganchrow · 05-25-08, 11:41 AM

Originally posted by Bullajami

Do you really need all 93 odds?

That depends ... do you really need an answer?

Bullajami · 05-25-08, 12:08 PM

Ganchrow · 05-25-08, 01:16 PM

Originally posted by Bullajami

-snip-

Which bets were at 1% and which were at 2%?

Bullajami · 05-25-08, 01:50 PM

Bottom 60 at 1%.

Ganchrow · 05-25-08, 05:55 PM

I'm coming up with a p-value of about 11.8%. This corresponds to the probability that a player not paying vig and with no edge would have achieved the same or better results strictly by chance.

Bullajami · 05-25-08, 06:15 PM

0.882 probability that my method is effective? Am I understanding you correctly?

Can you show me how it's calculated so that I can run the numbers again when I have more data?

Thank you for your assistance!

Ganchrow · 05-25-08, 06:57 PM

Originally posted by Bullajami

0.882 probability that my method is effective? Am I understanding you correctly?

There's an 88.2% probability that a bettor with no advantage and paying no vig would have achieved results worse than those you obtained.

That's slightly different than simply stating that there's an 88.2% probability that your method was effective.

Also this conclusion takes your results in a vacuum, meaning that it's assuming that this is the only strategy you're investigating. If this simply corresponds to but a single strategy out of many that just happened to be effective, then the 88.2% figure would be too high. Perhaps drastically.

By tradition, 95% is generally considered the minimum desired level for statistical significance. Hence, even if the above were true these results would not be considered statistically significant (meaning that we could not reject the hypothesis that the results were due to "luck"). This does not imply the results were due to luck, just that a statistician would be unable to reject that possibility at the 95% confidence level.

Originally posted by Bullajami

Can you show me how it's calculated so that I can run the numbers again when I have more data?

Because of the compounding effect I simply ran a 5,000,000 trial Monte Carlo simulation, which is how I obtained the 11.8% value.
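A minimal sketch of what such a simulation might look like, not the script actually used here. It assumes a zero-vig, zero-edge bettor wins each bet with probability 1/d (d being the decimal odds) and stakes a fixed fraction of the current bankroll. The function names and the all-at-+130 bet list are hypothetical stand-ins, since the real 93 prices are not reproduced in the thread.

```python
import random

def american_to_decimal(odds):
    """Convert American moneyline odds (e.g. +130, -105) to decimal odds."""
    return 1 + odds / 100 if odds > 0 else 1 + 100 / -odds

def simulate_p_value(bets, observed_growth, trials=20_000, seed=0):
    """Estimate P(final growth >= observed) for a no-edge, no-vig bettor.

    bets: list of (american_odds, fraction_of_current_bankroll_risked).
    observed_growth: e.g. 0.20 for a bankroll that finished up 20%.
    """
    rng = random.Random(seed)
    prepared = [(american_to_decimal(o), f) for o, f in bets]
    hits = 0
    for _ in range(trials):
        bankroll = 1.0
        for d, f in prepared:
            stake = f * bankroll          # compounding: stake follows the bankroll
            if rng.random() < 1 / d:      # fair win probability at zero vig
                bankroll += stake * (d - 1)
            else:
                bankroll -= stake
        if bankroll - 1 >= observed_growth:
            hits += 1
    return hits / trials

# Hypothetical data only: 33 bets at 2% followed by 60 bets at 1%, all at +130.
example_bets = [(130, 0.02)] * 33 + [(130, 0.01)] * 60
print(simulate_p_value(example_bets, 0.20))
```

With the real odds and stake sizes substituted for `example_bets`, and many more trials, the printed fraction plays the role of the 11.8% figure.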

I need to run off to wife's birthday party now, but at some later point I'll write a post detailing the process involved. It's rather straightforward.

Bullajami · 05-25-08, 07:10 PM

I thank God that I have access to your insane genius.

Ganchrow · 05-25-08, 07:18 PM

Originally posted by Bullajami

I thank God that I have access to your insane genius.

God had exactly zero to do with it -- I made a pact with the devil years ago.

(That's why I have such bad breath.)

Bullajami · 05-25-08, 07:25 PM

Didn't Pascal call making a pact with the Devil a bad bet?

I should have studied more when I was younger, but I was always too horny.

I appreciate you sharing your scholarly knowledge, your Monte Carlo simulation software, and for using Altoids when you post.

Ganchrow · 05-26-08, 01:47 AM

We're assuming zero-vig so the expectation on each bet is zero.

The variance of a bet of size x at decimal odds of d is given by (d-1)*x². So for example the variance of a 1-unit bet at +130 (decimal odds 2.30) would be 1² * (2.30-1) = 1.3, while the variance of a 2-unit bet at -105 (decimal odds 1 + 100/105 ≈ 1.9524) would be 2² * (100/105) ≈ 3.8095.
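The formula follows from the zero-vig assumption: the bet wins x(d-1) with probability 1/d and loses x otherwise, so the mean is zero and the variance is x²[(d-1)²/d + (d-1)/d] = (d-1)x². A small sketch that reproduces the two examples (the helper names are illustrative, not from any library):

```python
def american_to_decimal(odds):
    """Convert American moneyline odds to decimal odds."""
    return 1 + odds / 100 if odds > 0 else 1 + 100 / -odds

def bet_variance(stake, american_odds):
    """Variance of a zero-vig, zero-edge bet risking `stake`: (d - 1) * stake**2."""
    d = american_to_decimal(american_odds)
    return (d - 1) * stake ** 2

print(round(bet_variance(1, 130), 4))   # 1-unit bet at +130
print(round(bet_variance(2, -105), 4))  # 2-unit bet at -105
```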

If we ignore compounding the variance of a linear combination of independent bets is the sum of the individual variances. In the provided data set the total variance works out to be about 272.86 units². The standard deviation is the square root of the variance, which in this case is (272.86 units²)^½ ≈ 16.519 units.

The realized return is 20 / 16.519 ≈ +1.2108 standard deviations better than expected, implying a p-value of =NORMSDIST(1.2108) ≈ 88.701% from the normal distribution and =1-TDIST(1.2108,93-1,1) ≈ 88.545% from the t-distribution with 92 (i.e., 93 bets - 1) degrees of freedom (where NORMSDIST and TDIST refer to MS Excel's cumulative standard normal and Student t-distribution functions, respectively). As with the 88.2% figure above, these are the probabilities that a no-edge bettor would have done worse; the corresponding tail probabilities are one minus these values.
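For anyone without Excel to hand, the normal-distribution part of this step can be sketched in Python using only the standard library. `normal_cdf` is an illustrative stand-in for NORMSDIST; the t-distribution figure is not reproduced because the standard library has no Student t CDF.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution (counterpart of Excel's NORMSDIST)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

total_variance = 272.86            # units², the total quoted above
sd = math.sqrt(total_variance)     # roughly 16.52 units
z = 20 / sd                        # realized return of +20 units
print(round(z, 4), round(normal_cdf(z), 4))
```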

One problem with the above analysis, however, is that it completely discounts the effect of compounding. Nevertheless, because we're dealing with a relatively small number of bets, each at a relatively small percentage of bankroll, this effect is fairly small.

Taking compounding into account, the only real procedural difference would be in the calculation of variance. Because the results of previous bets impact the results of future bets (insofar as if we win an earlier bet we'd be betting more on a later bet), it's no longer correct that the total variance equals the sum of the variances.

Rather than go into some long derivation, I'll just state the result here. The total variance is given by the sum of the variances plus the sum of the product of the variances taken 2 at a time plus the sum of the product of the variances taken 3 at a time ... plus the product of all N variances.

This obviously represents a huge number of terms (2^N - 1 terms for N bets, or 9,903,520,314,283,042,199,192,993,791 terms for 93 bets), but if bets are sufficiently small we can approximate by looking only at terms up to the second or third order. Given the 93 bets of the original problem, the total variance (up to the second order) would then be:

Total Variance ≈ Σ_{i=1}^{N} σ²_i + Σ_{i=1}^{N} [ σ²_i × Σ_{j=i+1}^{N} σ²_j ]

which works out to a standard deviation of 16.629%, implying a t-distribution (with 92 degrees of freedom) p-value of 88.391%. Were we to go to the third order we'd of course find a slightly higher σ (~16.630%) and hence a slightly lower p-value (~88.390%).
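A sketch of how the truncated sum could be computed without enumerating terms. It assumes each variance is expressed as a squared fraction of bankroll (so a 1% bet at +130 contributes 0.01² × 1.3), and it relies on the algebraic identity that the full sum over all orders equals the product of (1 + σ²_i) minus 1. The function names and the sample variances are hypothetical, not the thread's actual data.

```python
import math

def compounded_variance(variances, order=2):
    """Sum of products of the variances taken 1, 2, ..., `order` at a time."""
    # e[k] accumulates the k-th elementary symmetric sum of the variances
    e = [1.0] + [0.0] * order
    for v in variances:
        for k in range(order, 0, -1):   # descend so each variance is used once per term
            e[k] += v * e[k - 1]
    return sum(e[1:])

def exact_compounded_variance(variances):
    """All orders at once: prod(1 + v_i) - 1."""
    total = 1.0
    for v in variances:
        total *= 1 + v
    return total - 1

# Hypothetical variances: 60 bets at 1% and 33 bets at 2%, all at +130.
sample = [0.01 ** 2 * 1.3] * 60 + [0.02 ** 2 * 1.3] * 33
for label, var in [("first order", compounded_variance(sample, 1)),
                   ("second order", compounded_variance(sample, 2)),
                   ("all orders", exact_compounded_variance(sample))]:
    print(label, round(100 * math.sqrt(var), 3), "%")
```

The closed-form product makes it easy to check how much the second- or third-order truncation leaves out for a given set of bets.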

This is fairly close to the p-value of 88.215% that I obtained from my 40,000,000-trial Monte Carlo run. The difference stems from having approximated the distribution of results with the t-distribution.

Bullajami · 05-26-08, 08:08 AM

I truly appreciate your time on this.

If I had a larger sample size, what would the impact be? Just an estimate. For example, if I had 93 additional identical bets with identical results, for 186 total bets, what is the magnitude of impact on the p-value?

I am trying to get a feel for when I meet the threshold for statistical relevance (p-value of 5%)

VideoReview · 05-26-08, 09:53 AM

Originally posted by Bullajami

I appreciate you sharing your scholarly knowledge, your Monte Carlo simulation software,

Ganchrow, I am wondering if at some point it would make sense to add your Monte Carlo script and input options (allow cut and paste) to the SBR betting tools? This could prove quite useful to the forum and save you a great deal of time in running simulations for everyone.

Justin7 · 05-26-08, 09:55 AM

Bullajami,

I'd be more interested in your methodology in selecting plays than your actual results.

Bullajami · 05-26-08, 10:30 AM

Originally posted by Justin7

Bullajami,

I'd be more interested in your methodology in selecting plays than your actual results.

Once it becomes statistically relevant I will post. Until then it's just another crazy idea of mine.

Ganchrow · 05-26-08, 11:30 AM

Originally posted by Bullajami

I truly appreciate your time on this.

If I had a larger sample size, what would the impact be? Just an estimate. For example, if I had 93 additional identical bets with identical results, for 186 total bets, what is the magnitude of impact on the p-value?

I am trying to get a feel for when I meet the threshold for statistical relevance (p-value of 5%)

If you included 93 additional out-of-sample bets, at the same odds and bet sizes as the initial 93, that yielded identical results (and I'm taking "identical results" to imply a return of 120%² - 1 = 44%), then this would result in a p-value of roughly 96.763%.
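As a rough cross-check only, the effect of doubling the sample can be approximated while ignoring compounding: the variance doubles, so the standard deviation grows by √2 while the return more than doubles. This shortcut is an assumption of the sketch, not the method behind the figure above, and it lands a little higher (about 97.0%) because it leaves out the extra variance that compounding adds.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

sd_93 = 16.519                          # percent of bankroll, non-compounded figure for 93 bets
sd_186 = sd_93 * math.sqrt(2)           # variance doubles when the same 93 bets are repeated
growth_186 = (1.20 ** 2 - 1) * 100      # +44% after two identical +20% runs
z_186 = growth_186 / sd_186
print(round(z_186, 3), round(normal_cdf(z_186), 4))
```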

Bullajami · 08-06-08, 09:13 AM

At the risk of wearing out my welcome, I'd like to ask that this simulation be run again, please. I have increased the sample size to 201 bets, all on moneyline dogs. In reality I had a combination of 1% and 2% flat bets (risked amount), but I converted to a model where I used all 2% flat bets (risked). On this model I show an 83.002% ROI.
Can you tell me the p-value of my results? (or explain how I could calculate it myself in a language I understand?)

dcbt · 08-06-08, 10:47 AM

I'm glad this thread got revisited - I missed it the first time around, so it gave me a chance to set up a spreadsheet for my own testing. One question, though - when using 'rate of return,' are we talking about the traditional 'investing' rate of return - ie, if a bankroll goes from 1000 to 1200, that's a 20% ROR? Or, are you using it in the manner many gamblers do, that being a percentage of money bet? I assume the former, but just verifying... Thanks.

(By the way, I tested your revised data with my template and came up with 99.21% NORMSDIST. I *think* it's accurate, as I was able to tie to Ganchrow's calc in his prior post, so hopefully I've got my template set up right - we'll find out if/when he confirms.)

Bullajami · 08-06-08, 11:38 AM

Originally posted by dcbt

I'm glad this thread got revisited - I missed it the first time around, so it gave me a chance to set up a spreadsheet for my own testing. One question, though - when using 'rate of return,' are we talking about the traditional 'investing' rate of return - ie, if a bankroll goes from 1000 to 1200, that's a 20% ROR? Or, are you using it in the manner many gamblers do, that being a percentage of money bet? I assume the former, but just verifying... Thanks.

(By the way, I tested your revised data with my template and came up with 99.21% NORMSDIST. I *think* it's accurate, as I was able to tie to Ganchrow's calc in his prior post, so hopefully I've got my template set up right - we'll find out if/when he confirms.)

Gahhhhh! I was trying to be careful, too. But you are correct, the 83.002% figure is the growth of bankroll number, not the ROI number.

ROI based on total amount wagered is 15.94%

Good catch, thanks.

dcbt · 08-06-08, 12:07 PM

Originally posted by Bullajami

Gahhhhh! I was trying to be careful, too. But you are correct, the 83.002% figure is the growth of bankroll number, not the ROI number.

ROI based on total amount wagered is 15.94%

Good catch, thanks.

With the 15.94%, I get 67.84% NORMSDIST now.
Which one is supposed to be used for this calc, the amt based on total wagered, I take it?

Bullajami · 08-06-08, 12:15 PM

Originally posted by dcbt

With the 15.94%, I get 67.84% NORMSDIST now.
Which one is supposed to be used for this calc, the amt based on total wagered, I take it?

Ganch the Great never asked for my ROI based on total wagered for part one of this thread, so I tend to think (and hope!) it's the 83% number.

Ganchrow · 08-07-08, 02:19 AM

Originally posted by Bullajami

Gahhhhh! I was trying to be careful, too. But you are correct, the 83.002% figure is the growth of bankroll number, not the ROI number.

ROI based on total amount wagered is 15.94%

Good catch, thanks.

I'm still a bit unclear as to your chosen terminology.

Given the list of 201 prices above, at 2 units per wager the standard deviation would be 34.41 units.

If you're saying you risked a flat bet of two units per wager (201×2 = 402 units) and then ended up 15.94%×402 = 64.08 units, then your p-value would be about 1-NORMSDIST(64.08/34.41) ≈ 1-96.87% ≈ 3.13%.

If OTOH, you're saying you risked a flat bet of two units per wager (201×2 = 402 units) and then ended up 15.94 units, then your p-value would be about 1-NORMSDIST(15.94/34.41) ≈ 1-67.84% ≈ 32.16%, which is exactly the same as dcbt's figure.

(Oh and btw, were you indeed +83.002 units as your earlier post may have mistakenly implied (or as dcbt and I may have misinterpreted), then dcbt's figure of 99.21% would have been correct).

The p-value, of course, corresponds to the probability (from the normal distribution in these cases) that a player with no edge whatsoever would have obtained results at least this positive. Note that this implicitly assumes no data mining. For example, were I to test 100 different independent models, then we'd fully expect to see a model with a p-value of 1%. This, however, would by itself be of little predictive use.

Bullajami · 08-07-08, 05:23 AM

Thanks again. Sorry I have not been clear.

I am saying that the bet size was consistently 2% of bankroll, compounded after each result.

After betting on the 201 underdogs listed, the bankroll increased 83.002% from its original size.

The 15.94% number comes from dividing the dollar amount of bankroll growth by the total dollar amount wagered.
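The two figures being discussed can be told apart with a short sketch. It assumes a stake of a fixed fraction of the current bankroll and a list of (odds, won) results in betting order; the function name and the four sample results are hypothetical.

```python
def american_to_decimal(odds):
    """Convert American moneyline odds to decimal odds."""
    return 1 + odds / 100 if odds > 0 else 1 + 100 / -odds

def bankroll_summary(results, fraction=0.02, start=1.0):
    """Return (bankroll growth, ROI on total amount wagered) under compounding.

    results: list of (american_odds, won) pairs in the order the bets were made.
    """
    bankroll = start
    wagered = 0.0
    for odds, won in results:
        stake = fraction * bankroll             # 2% of the current bankroll
        wagered += stake
        if won:
            bankroll += stake * (american_to_decimal(odds) - 1)
        else:
            bankroll -= stake
    growth = bankroll / start - 1               # e.g. the 83.002% style of figure
    roi = (bankroll - start) / wagered          # e.g. the 15.94% style of figure
    return growth, roi

# Hypothetical results, not the thread's data.
growth, roi = bankroll_summary([(130, True), (150, False), (120, True), (200, False)])
print(round(growth, 5), round(roi, 5))
```

Growth is measured against the starting bankroll, while ROI is measured against the sum of all stakes, so the same results give two very different percentages.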

So, which p-value applies?

Or, if I have screwed up this terminology as well, what I am really after is some statistical evidence that my method of selecting these bets is sound.

Ganchrow · 08-07-08, 07:42 AM

Originally posted by Bullajami

Thanks again. Sorry I have not been clear.

I am saying that the bet size was consistently 2% of bankroll, compounded after each result.

After betting on the 201 underdogs listed, the bankroll increased 83.002% from its original size.

The 15.94% number comes from dividing the dollar amount of bankroll growth by the total dollar amount wagered.

So, which p-value applies?

Or, if I have screwed up this terminology as well, what I am really after is some statistical evidence that my method of selecting these bets is sound.

Let's start with an easy question.

What would your returns have been had you been flat betting?

Bullajami · 08-07-08, 08:39 AM

All bets identical size, 2% of original bankroll. After 201 bets, +38.83 units

Ganchrow · 08-07-08, 09:45 AM

Originally posted by Bullajami

All bets identical size, 2% of original bankroll. After 201 bets, +38.83 units

OK I'm still a bit unclear on your terminology. Is 1 unit in this example 1% of initial bankroll or 2%?

So in other words, after 201 flat bets of let's say 1 unit each, how many units did you wind up?

Bullajami · 08-07-08, 10:03 AM

Hypothetical bankroll starting = $500
Betting unit $10 on every bet
201 bets, odds as posted above
Final bankroll is $888.30
+38.83 betting units

Ganchrow · 08-07-08, 10:21 AM

Originally posted by Bullajami

Hypothetical bankroll starting = $500
Betting unit $10 on every bet
201 bets, odds as posted above
Final bankroll is $888.30
+38.83 betting units

So flat betting 1 unit per bet over the 201 listed bets, your standard deviation would be about 17.20 units.

Since your return over the same period was +38.83 units, this would imply a standard score of 38.83 / 17.20 ≈ 2.26.

This corresponds to a (one-tailed Gaussian) p-value of =1-NORMSDIST(2.26) ≈ 1.200%, which, if these results were strictly obtained out-of-sample, would certainly cause me to sit up and take notice.
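The flat-betting calculation can be sketched end to end as follows, assuming zero vig so that each 1-unit bet at decimal odds d contributes a variance of d-1. The function names are illustrative; the 201 actual prices are not reproduced here, so the sketch plugs in the standard deviation quoted above.

```python
import math

def american_to_decimal(odds):
    """Convert American moneyline odds to decimal odds."""
    return 1 + odds / 100 if odds > 0 else 1 + 100 / -odds

def flat_bet_sd(odds_list, stake=1.0):
    """Standard deviation of total profit for flat bets of `stake` at zero vig and zero edge."""
    return math.sqrt(sum((american_to_decimal(o) - 1) * stake ** 2 for o in odds_list))

def one_tailed_p(units_won, sd):
    """P(result at least this good) under a normal approximation, i.e. 1 - NORMSDIST(z)."""
    return 0.5 * math.erfc(units_won / sd / math.sqrt(2))

# With the full list of prices, flat_bet_sd(prices) would supply the 17.20 figure.
print(round(one_tailed_p(38.83, 17.20), 4))
```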

Bullajami · 08-07-08, 10:36 AM

Originally posted by Ganchrow

So flat betting 1 unit per bet over the 201 listed bets, your standard deviation would be about 17.20 units.

Since your return over the same period was +38.83 units, this would imply a standard score of 38.83 / 17.20 ≈ 2.26.

This corresponds to a (one-tailed Gaussian) p-value of =1-NORMSDIST(2.26) ≈ 1.200%, which, if these results were strictly obtained out-of-sample, would certainly cause me to sit up and take notice.

Ummmm...
Thank you very much for all of your help, but I would bet (full Kelly) that anyone who knows what you just wrote could probably have done the math for themselves.

Pretend for a minute that the tank on my toilet is not stacked with calculus text books. In fact, just go ahead and presume that I have not opened a math book of any kind for more than 2 decades.

Does my data support a picking system that should be profitable in the long term?

dcbt · 08-07-08, 10:47 AM

.

Ganchrow · 08-07-08, 12:34 PM

Originally posted by Bullajami

Ummmm...
Thank you very much for all of your help, but I would bet (full Kelly) that anyone who knows what you just wrote could probably have done the math for themselves.

Pretend for a minute that the tank on my toilet is not stacked with calculus text books. In fact, just go ahead and presume that I have not opened a math book of any kind for more than 2 decades.

Does my data support a picking system that should be profitable in the long term?

Perhaps I didn't understand your initial question.

You asked:

Originally posted by Bullajami

Can you tell me the p-value of my results?

The answer (i.e., the p-value), as I gave in my previous post, is 1.200%. You can feel free to ignore the rest.

Is that value generally considered significant? Yes, very much so.

But the real question isn't so much the raw value as how these numbers were obtained.

For example if you looked at 1,000 different strategies, you'd expect, purely by chance, to find about 12 strategies of significance that great (p-values ~ 1.2%). Imputing too much meaning in these results is known as data dredging or sometimes data mining and carries with it a very negative connotation. (Indeed it's been the ruin of many a poor boy and God I know I'm one ...)
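The arithmetic behind this warning is short enough to sketch. It assumes every strategy tested has no edge at all and that the strategies are independent of one another; the function names are illustrative.

```python
def expected_false_positives(n_strategies, p_value):
    """Expected number of no-edge strategies that look at least this significant."""
    return n_strategies * p_value

def prob_at_least_one(n_strategies, p_value):
    """Chance that at least one of n independent no-edge strategies looks this good."""
    return 1 - (1 - p_value) ** n_strategies

print(expected_false_positives(1000, 0.012))     # roughly 12 strategies
print(round(prob_at_least_one(1000, 0.012), 6))  # very close to 1
print(round(prob_at_least_one(100, 0.01), 3))    # 100 models at the 1% level
```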

The point is that if you came up with a single theory, tested it on a data set different from the one used to postulate said theory then these results would most definitely be worth pursuing. To the degree that these results are indeed the result of data dredging I'd be increasingly cautious of its conclusions.

Anyway, I've made quite a few posts on the subject of data mining and in-sample vs. out-of-sample testing (many of them in response to questions posed by poster VideoReview), which you should be able to find if you search around a bit.

Let me know if this makes sense and/or if you have further questions.

Bullajami · 08-07-08, 12:57 PM

Originally posted by Ganchrow

Perhaps I didn't understand your initial question.

You asked:

The answer (i.e., the p-value), as I gave in my previous post, is 1.200%. You can feel free to ignore the rest.

Is that value generally considered significant? Yes, very much so.

But the real question isn't so much the raw value as how these numbers were obtained.

For example if you looked at 1,000 different strategies, you'd expect, purely by chance, to find about 12 strategies of significance that great (p-values ~ 1.2%). Imputing too much meaning in these results is known as data dredging or sometimes data mining and carries with it a very negative connotation. (Indeed it's been the ruin of many a poor boy and God I know I'm one ...)

The point is that if you came up with a single theory, tested it on a data set different from the one used to postulate said theory then these results would most definitely be worth pursuing. To the degree that these results are indeed the result of data dredging I'd be increasingly cautious of its conclusions.

Anyway, I've made quite a few posts on the subject of data mining and in-sample vs. out-of-sample testing (many of them in response to questions posed by poster VideoReview), which you should be able to find if you search around a bit.

Let me know if this makes sense and/or if you have further questions.

Muchas gracias, Senor.

I think I get it now. Once again, thank you for the noggin loan and the patience.

This was, in fact, out-of-sample data based on a theory/hunch that I had from previous betting experiences. Not the result of data mining, just some casual human observations I had made along the way. I partitioned off an inconsequential part of my BR and started betting my hunch/theory. Once it looked like the results were promising, I decided I had better try to prove it wasn't a fluke.

Now that I have a system that seems likely to provide an edge (p-values ~ 1.2%) I believe the next step for me would be to quantify that edge so that I can properly size bets for maximum BR growth. Can I find an aggregate edge and apply it to all bets or will I be risking too much value by not establishing this for individual bets?

Finally, where's the rosetta stone for Ganchrow posts? (As in, what reference(s) do I need to comprehend and learn this for myself?)