Adjusting z score for assumed edge

**Justin7** · 07-23-10, 06:38 PM

I would be very careful using this methodology. As a starting point, I would back test your model (for plays that you didn't make) to get your sample size up a lot further.

There are other problems also. Is your bet size the same for each bet?

Z-score versus breakeven is a starting point. If you're +2, that is "of interest". If you are doing the calculations right, you could adjust your breakeven to calculate your Z-scores at different EVs (as you suggested).

**hutennis** · 07-24-10, 10:52 AM

Originally posted by Justin7

I would be very careful using this methodology. As a starting point, I would back test your model (for plays that you didn't make) to get your sample size up a lot further.

There are other problems also. Is your bet size the same for each bet?

Z-score versus breakeven is a starting point. If you're +2, that is "of interest". If you are doing the calculations right, you could adjust your breakeven to calculate your Z-scores at different EVs (as you suggested).

TY for comment Justin.

My sample size is only about 200 at the moment. It's small, but correct me if I'm wrong: doesn't a confidence interval take care of the sample size issue to a degree?

As far as bet sizes go, my variance is calculated in standardized units, not in dollars.
The size of each bet is determined by, let's call it, a price coefficient.
If it is $10, for example, then I'll bet 10 to win 12 (+120) and risk 12 to win 10 (-120).

This way, regardless of how widely spread my bets are, by dividing bet size and profit/loss by corresponding price coefficient I get standardized units that I use to calculate variance.

As for the calculation itself, it goes as follows:

variance = (bet_size)^2 * (decimal_odds - 1)
SD = sq. root of variance
Profit (in standardized units, NOT in $$)
Z Score = (profit - ev)/SD
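
A minimal Python sketch of the calculation above, assuming a zero-edge null (so ev = 0), independent bets whose variances add, and pushes left out of the record. The function names and the sample record are hypothetical, made up for illustration only.

```python
import math

def decimal_odds(american):
    """Convert American odds (e.g. +120, -120) to decimal odds."""
    if american > 0:
        return 1 + american / 100
    return 1 + 100 / -american

def stake_units(american):
    """Stake in standardized units under the staking rule described above:
    risk 1 unit on plus-money prices, risk enough to win 1 unit on minus-money prices."""
    return 1.0 if american > 0 else -american / 100

def z_score_zero_edge(bets):
    """bets: iterable of (american_odds, won) pairs; pushes are assumed excluded.
    Under a zero-edge null, ev = 0 and the per-bet variances simply add up."""
    profit = 0.0
    variance = 0.0
    for american, won in bets:
        d = decimal_odds(american)
        s = stake_units(american)
        profit += s * (d - 1) if won else -s
        variance += s ** 2 * (d - 1)  # variance = (bet_size)^2 * (decimal_odds - 1)
    return profit / math.sqrt(variance)

# Hypothetical three-bet record, for illustration only
record = [(120, True), (-120, False), (-110, True)]
print(round(z_score_zero_edge(record), 3))
```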

So that's about it. I hope I'm right so far.

But all that is not what really concerns me though.

What REALLY concerns me is the overall validity of making statistical arguments about SB results by simply assuming that the Central Limit Theorem applies to them.

For example, I would never ever, ever put any trust in any z score of any model when it comes to financial markets regardless of the amount of back testing and/or sample size.
The bell curve just can't handle the massive outliers that are so common in that extreme environment.
The events of October '87 would have a sigma of 20, according to the "models". That's like once in a few billion lifetimes of the Universe or something like that.
Yes, Black-Scholes is a nifty-keen way to value options but nevertheless sucks big-time in valuing low-probability long-fat-tail catastrophic events.
And its Nobel Prize winning creator should know it by now.
That's how he lost billions. Twice.

Same thing with poker results. All those win rates in BB/100 over hundreds of K hands or whatever.
And all those proud owners of those win rates who assert flatly that this win rate and that variance mean that there is simply an X% likelihood that the player in question is a winning player.
These statistics are a convenience, and they are suggestive and helpful, but they are not the truth.
Anyone who works with them as if they were the truth is eventually going to wind up with a lot of egg on their face, trying to figure out how the hell this nasty downswing can even be happening to them if its probability, according to the CLT, is 1 in 5 million hands played. And then it will happen again. A month and a half later.

So, since I'm very new to SB, what I really want to know is where am I?

Am I in an environment where "nature rules"?
And, if that's the case or reasonably close to it, I can have the same kind of confidence in my expectations as I would in how often AA is dealt, in detecting defective parts on a production line, or in establishing the life expectancy of certain segments of the population.

Or am I in a neighborhood where human nature runs wild on a backdrop of hugely incomplete and constantly changing information? Then all those z score calculations are not too useful.

Personally, I'm somewhat optimistic, since odds in SB, for the most part, would seem to be a very educated representation of humans' relative natural abilities along with some other "natural" things, and "things natural" are the bell curve's specialty.

This optimistic point of view is the reason why my well above +2 score is "of such an interest" to me.

Any comments are appreciated.

**Justin7** · 07-24-10, 12:34 PM

I think sports has a better bell-curve than finances.

As I said, Z-score is a starting point. Another test I would use is the "line movement" test. How much did the line move from the time you recorded the bet to close? What is your median line move? If you're seeing a median line move of 2%, you probably are a very good player.

**MarketMaker** · 07-24-10, 01:42 PM

Justin how do you calculate 2% of line move? It seems obvious for totals but what about sides and ML?

**Justin7** · 07-24-10, 01:51 PM

Originally posted by MarketMaker

Justin how do you calculate 2% of line move? It seems obvious for totals but what about sides and ML?

Look at price.

If you made a bet at -110 and it closed at -120, this reflects a move of (-120/-220) - (-110/-210) = 2.16%. If you do that consistently, you're going to win.
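
The same arithmetic as a short Python sketch, assuming raw implied probabilities with no adjustment for the vig (which is what the formula above uses). The function names and the list of moves are hypothetical.

```python
from statistics import median

def implied_prob(american):
    """Raw implied win probability of an American price (vig not removed)."""
    if american < 0:
        return -american / (-american + 100)
    return 100 / (american + 100)

def line_move(bet_price, closing_price):
    """Change in implied probability from the price taken to the closing price."""
    return implied_prob(closing_price) - implied_prob(bet_price)

print(f"{line_move(-110, -120):.2%}")  # 2.16%

# Median move over a hypothetical set of (bet price, closing price) pairs
moves = [line_move(b, c) for b, c in [(-110, -120), (105, -105), (-115, -110)]]
print(f"{median(moves):.2%}")
```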

**Ganchrow** · 07-24-10, 02:25 PM

Originally posted by hutennis

As for the calculation itself, it goes as follows:

variance = (bet_size)^2 * (decimal_odds - 1)
SD = sq. root of variance
Profit (in standardized units, NOT in $$)
Z Score = (profit - ev)/SD

If your null hypothesis assumed % edge ≠ 0, then your formula for variance would be a bit off. The correct formula should be:

variance = (bet_size)² * (decimal_odds - 1 - Edge) * (1 + Edge)
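
For anyone who wants to check where this comes from, here is a short derivation, assuming Edge means expected profit per unit staked. The symbols $b$ (bet size), $d$ (decimal odds), $p$ (win probability) and $E$ (Edge) are shorthand introduced here and are not from the post.

$$
E = p\,d - 1 \quad\Longrightarrow\quad p = \frac{1+E}{d}, \qquad 1-p = \frac{d-1-E}{d}
$$

The bet pays $b(d-1)$ with probability $p$ and $-b$ with probability $1-p$, so the two outcomes are $b\,d$ apart and

$$
\operatorname{Var} = (b\,d)^2\,p\,(1-p) = b^2\,(d-1-E)\,(1+E).
$$

Setting $E = 0$ gives $b^2\,(d-1)$, which is the zero-edge formula quoted above.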

Originally posted by hutennis

What REALLY concerns me is the overall validity of making statistical arguments about SB results by simply assuming that the Central Limit Theorem applies to them.

If you search the Think Tank forum you'll find a simple Monte Carlo simulation program I wrote in Perl. You could compare that to results obtained using a Z-Score.

If you're comfortable with programming, hacking together a VB script to traverse the entirety of the binomial outcome tree should be a straightforward exercise in combinatorics. Provided you had manageably few bet classes, this would represent a faster computation than a Monte Carlo Sim.
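
A rough sketch of that enumeration idea, written in Python rather than VB. It assumes independent bets grouped into a few classes (same stake and same decimal odds within a class), a win probability of (1 + edge) / decimal_odds under the null, and no pushes. The function names and the sample classes are hypothetical.

```python
import math
from itertools import product

def class_outcomes(n, stake, decimal_odds, win_prob):
    """All (total profit, probability) pairs for n identical independent bets."""
    outcomes = []
    for wins in range(n + 1):
        prob = math.comb(n, wins) * win_prob ** wins * (1 - win_prob) ** (n - wins)
        profit = wins * stake * (decimal_odds - 1) - (n - wins) * stake
        outcomes.append((profit, prob))
    return outcomes

def exact_p_value(classes, observed_profit, edge=0.0):
    """Exact P(total profit >= observed_profit) under the null edge.
    classes: list of (number_of_bets, stake, decimal_odds) tuples."""
    per_class = [class_outcomes(n, s, d, (1 + edge) / d) for n, s, d in classes]
    p_value = 0.0
    # Walk every combination of win counts across the bet classes
    for combo in product(*per_class):
        total = sum(profit for profit, _ in combo)
        if total >= observed_profit - 1e-9:  # small tolerance for float rounding
            p_value += math.prod(prob for _, prob in combo)
    return p_value

# Hypothetical record: 100 bets at -110 for 1.1 units, 108 bets at +120 for 1 unit
classes = [(100, 1.1, 1 + 100 / 110), (108, 1.0, 2.2)]
print(exact_p_value(classes, observed_profit=25.0))
```

The work grows with the product of (bets + 1) over the classes, which is why this only stays cheap when the bet classes are few.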

Originally posted by hutennis

For example, I would never ever, ever put any trust in any z score

That would generally be a wise decision.

In sports betting, one real problem with frequentist-style hypothesis tests and confidence intervals is that of selection bias.

One form of this is often seen when investigating multiple models. For example, if you intended to perform hypothesis tests on 10 (orthogonal) models then the probability of at least one null rejection at α = 5% would be 1-95%¹⁰ ≈ 40.1%.

To get this meta-Type I error probability down to the generally accepted level of 5%, you'd actually need to use a per-test value of α = 1 - ¹⁰√95% ≈ 0.512%.
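
Both figures can be reproduced in a few lines of Python. The adjustment in the second function is what is usually called the Šidák correction; like the text above, it assumes the tests are independent. Function names are hypothetical.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one false rejection across n independent tests."""
    return 1 - (1 - alpha) ** n_tests

def per_test_alpha(target, n_tests):
    """Per-test alpha that holds the overall Type I error at `target`."""
    return 1 - (1 - target) ** (1 / n_tests)

print(f"{familywise_error(0.05, 10):.1%}")  # 40.1%
print(f"{per_test_alpha(0.05, 10):.3%}")    # 0.512%
```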

And many other potential problems exist with this methodology taken as given. If you're interested, search the Think Tank (or inquire here) for some examples.

Oh, and do check out Bayesian inference as seasoning (or an alternative) to the frequentist flavors.

**Ganchrow** · 07-24-10, 02:31 PM

Originally posted by MarketMaker

Justin how do you calculate 2% of line move? It seems obvious for totals but what about sides and ML?

To really do this properly one would need to consider it from the perspective of Bayesian inference.

I'll try to post a practical example of this soon.

In the meantime, my recent post on the topic of Bayesian inference might suggest a few reasonable paths.

**MarketMaker** · 07-24-10, 02:42 PM

Any chance you could use the first WNBA game as an example? Assume I bet Indiana +2.5 and the line is now Indiana -1. What would the line move be in terms of %?

**Ganchrow** · 07-24-10, 03:42 PM

Originally posted by Justin7

Look at price.

If you made a bet at -110 and it closed at -120, this reflects a move of (-120/-220) - (-110/-210) = 2.16%. If you do that consistently, you're going to win.

Even without getting all Bayesian, right off the bat I'd probably be more inclined to use ΔEdge calculated between the position open and market close implied probs. In practice this is nothing more than probability delta weighted by realized net decimal odds.

Provided it were performed with angelic good faith, further weighting each market by units wagered might make for a first-pass improvement.

Another direction to go might be to use logit^-1(Δlogit(implied prob)) as a drop-in for the mathematically problematic Δimplied prob, and then weighting each market by either, neither, or both of realized decimal odds and unit wager sizes.

A third possibility might be found with ΔFull Kelly Utility, which in practice could yield a good compromise between the weighted probability delta method and the weighted logit^-1-logit-delta method.
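
A tentative Python sketch of the first two ideas, for a single market. It rests on assumptions that are not spelled out in the post: implied probabilities are taken raw (vig not removed), and ΔEdge is read as the probability delta multiplied by the decimal odds actually obtained, i.e. the edge of the price taken if the closing implied probability were the true one. The intended weighting may differ in detail, and the function names are hypothetical.

```python
import math

def implied_prob(american):
    """Raw implied win probability of an American price (vig not removed)."""
    if american < 0:
        return -american / (-american + 100)
    return 100 / (american + 100)

def logit(p):
    return math.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + math.exp(-x))

def delta_edge(bet_price, closing_price):
    """Probability delta weighted by the decimal odds of the price taken
    (one reading of the delta-Edge idea; an assumption, see lead-in)."""
    obtained_decimal = 1 / implied_prob(bet_price)
    return (implied_prob(closing_price) - implied_prob(bet_price)) * obtained_decimal

def logit_move(bet_price, closing_price):
    """logit^-1 of the change in logit(implied prob); 0.5 means no move."""
    shift = logit(implied_prob(closing_price)) - logit(implied_prob(bet_price))
    return inv_logit(shift)

print(round(delta_edge(-110, -120), 4))
print(round(logit_move(-110, -120), 4))
```

Weighting each market by units wagered, as suggested above, would then be a weighted average of these per-market figures.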

Whatever method chosen, the "further analysis step" would likely involve the estimation of an appropriate p-value for the aggregate results. But additional thought still required -- notably wrt a reasonable volatility model. Any ideas?

Still, a Bayesian analysis would ultimately be the way to go.

**hutennis** · 07-25-10, 12:28 AM

Originally posted by Ganchrow

In sports betting, one real problem with frequentist-style hypothesis tests and confidence intervals is that of selection bias.

TY for your input, Ganchrow. Very, very interesting.

Could you tell me, please, where selection bias can come from in my case?
I really don't "select" anything.
I just record the results of the trades placed, win, lose or draw, and then I keep a running total of calculations based on those results.
So, like I said, no selection process is involved.

Thanks again.

**Ganchrow** · 07-25-10, 02:55 PM

Originally posted by hutennis

Could you tell me, please, where selection bias can come from in my case?
I really don't "select" anything.
I just record the results of the trades placed, win, lose or draw, and then I keep a running total of calculations based on those results.
So, like I said, no selection process is involved.

Ostensibly you've selected your trades based upon some set of criteria.

If these criteria have not been set in stone since the beginning of time then there's room for selection bias.

**hutennis** · 07-25-10, 03:44 PM

Originally posted by Ganchrow

Ostensibly you've selected your trades based upon some set of criteria.

If these criteria have not been set in stone since the beginning of time then there's room for selection bias.

OK, I see now.

My selection criteria were established prior to making my first trade and have never been touched since.
Every single bet out of 208 bets so far has been placed using exactly the same set of rules and process.
I take great care in making sure that nothing is influencing or interfering with it.

What else do you think I should be looking for to make sure I'm not fooling myself with my results?

TY

**That Foreign Guy** · 05-21-11, 07:27 PM

Originally posted by Justin7

I think sports has a better bell-curve than finances.

I am almost certain it does. Sportsbetting outcomes are curtailed where finance ones often aren't. You can't have a negative black swan if downside risk is limited. LTCM would have been fine if they stuck to buying options.