reduced kelly bet sizing (received via PM)

  • Ganchrow
    SBR Hall of Famer
    • 08-28-05
    • 5011

    #1
    reduced kelly bet sizing (received via PM)
    Originally posted by 8lrr8
    Ganchrow,

    i was playing around w/ your kelly calc, and noticed something strange:

    suppose my winrate is 59%, all at -113 odds. at full kelly, the median bankroll (if i start w/ $10k) after 600 bets is $731k.

    i've heard that if one bets half kelly, one gets (in theory) 75% of the full kelly's return (based on median bankroll, and not average BR). similarly, if one bets 70% of full kelly, one's (median) return is ~90% of full kelly. but when i input 0.7 and 0.5 for 70% and half kelly (respectively) into the calc, the results are very different.

    instead of expecting a median BR (after 600 bets) of 658k and 548k for 70% and 50% kelly betting (respectively), the calculator gives me a median BR of 498k (for 70% kelly) and a median bankroll of 251k (for half-kelly).

    what's the explanation for this? have i been misinformed about the expected median return for half-kelly?
    As a general rule, it is indeed true that with the edges and odds one is likely to encounter in sports betting, the expected growth rates at half-Kelly and 70%-Kelly correspond to approximately 75% and 90%, respectively, of the full-Kelly growth rate. You should note, however, that these approximations break down drastically at the extremes.

    So why the discrepancy after 600 bets?

    The two approximations you've noted apply to (geometric) average growth rates, while the median bankroll is a compounded growth figure. As Albert Einstein allegedly quipped, “The most powerful force in the universe is compound interest.”

    Now while Einstein's authorship of the above statement is dubious, there's no question that the effect of compound interest over a large number of trials can be substantial.

    So let's look at your example above:
    • US Odds: -113
    • Win Prob: 59%
    • Bankroll: $10,000.000
    • Trials: 600


    At full-Kelly:
    • Stake: $1,267.000
    • Expected Growth: $71.807 (71.807/10,000 = 0.71807%)
    • Median bankroll after 600 trials: $731,869.998 ≈ $10,000 × (1 + 0.71807%)^600 (slight difference due to rounding)


    At 70%-Kelly:
    • Stake: $891.031
    • Expected Growth: $65.375 (65.375/10,000 = 0.65375%)
    • Median bankroll after 600 trials: $498,855.372 ≈ $10,000 × (1 + 0.65375%)^600 (slight difference due to rounding)


    At 50%-Kelly:
    • Stake: $638.125
    • Expected Growth: $53.905 (53.905/10,000 = 0.53905%)
    • Median bankroll after 600 trials: $251,696.177 ≈ $10,000 × (1 + 0.53905%)^600 (slight difference due to rounding)


    So indeed what we see is that 50%-Kelly expected growth is about 75% of full Kelly growth (0.53905% / 0.71807% = 75.069%), and that 70%-Kelly growth is about 90% of full Kelly growth (0.65375% / 0.71807% = 91.043%).
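    For anyone who wants to reproduce these figures, here is a minimal Python sketch. It assumes, based on the stakes quoted above, that the calculator sizes a κ-Kelly bet by solving (1 + f*b)/(1 - f) = (p*b/q)^κ, which reduces to the familiar f = (pb - q)/b at κ = 1; the function and variable names are mine, not the calculator's.

from math import exp, log

def kelly_figures(p, b, kappa=1.0, bankroll=10_000.0, trials=600):
    """Stake, per-bet expected growth, and median bankroll for a kappa-Kelly bettor."""
    q = 1.0 - p
    r = (p * b / q) ** kappa
    f = (r - 1.0) / (b + r)                        # fraction of bankroll staked
    g = p * log(1.0 + f * b) + q * log(1.0 - f)    # expected log-growth per bet
    return {
        "stake": bankroll * f,
        "expected_growth_pct": 100.0 * (exp(g) - 1.0),
        "median_bankroll": bankroll * exp(trials * g),   # compounding over `trials` bets
    }

b = 100.0 / 113.0                                  # -113 American odds as decimal payout odds
for kappa in (1.0, 0.7, 0.5):
    print(kappa, kelly_figures(0.59, b, kappa))

    Run as-is, this prints stakes of roughly $1,267, $891 and $638 and median bankrolls near $732k, $499k and $252k, in line with the calculator figures above.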
  • VideoReview
    SBR High Roller
    • 12-14-07
    • 107

    #2
    Help With Kelly

    Everyone: sorry about posting this in the public forum but I could not find a way to PM Ganchrow so I thought I would reply to a seldom looked at post that had no replies anyway.

    Hello Ganchrow.

    Although I have never posted yet, I have been actively lurking on SBR since about March and am a bit of a fan of your posts.

    I am sure you must get many emails from people like myself who know just enough about math to make them dangerous. Here is another one.

    I have read your posts on Kelly. I have also read the popular book and the original paper and several websites on the topic.

    I keep coming back to the same problem. How do I calculate the sampling error and factor this into the Kelly formula of (bp-q)/b?

    I have been assuming that .98/sqrt(n) will give me the percentage I need to subtract. What I have been doing is subtracting .98/sqrt(n) from sqrt(R²) in my Excel regression analysis. For example, if R² = .05 and I have 100 samples, then p = sqrt(.05) - .98/sqrt(100) = .1256068.

    Here is a real life scenario. I have 2 variables that predict 1 result. The result (shown as RESULTS below) is the fair (no-vig) NHL moneyline derived from Pinnacle's closing price. I have 144 samples representing 72 different games (home and away are each given a separate line). Here are the regression results:

    Regression of variable RESULTS:

    Goodness of fit statistics:

    Observations 144.000
    Sum of weights 144.000
    DF 141.000
    R² 0.088
    Adjusted R² 0.075
    MSE 1.040
    RMSE 1.020
    MAPE 92.490
    DW 1.808
    Cp 3.000
    AIC 8.589
    SBC 17.498
    PC 0.951


    Analysis of variance:

    Source DF Sum of squares Mean squares F Pr > F
    Model 2 14.089 7.044 6.775 0.002
    Error 141 146.612 1.040
    Corrected Total 143 160.701
    Computed against model Y=Mean(Y)


    Model parameters:

    Source Value Standard error t Pr > |t| Lower bound (95%) Upper bound (95%)
    Intercept 5.872 2.688 2.185 0.031 0.559 11.185
    AH -11.700 3.186 -3.673 0.000 -17.998 -5.402
    N 5.851 2.519 2.322 0.022 0.870 10.831


    Equation of the model:

    RESULTS = 5.87156995163331-11.7001702939112*AH+5.85070213587917*N

    My first question, which is really just curiosity, is: do you think the quality of these results justifies continuing to develop this model?

    My second question is: how can I determine the correct maximum Kelly for a desired confidence level (I have set the regression to 95%, and I believe I have also done this with .98/sqrt(n)) using the above numbers? Just in case, I have pasted below 2 columns of numbers you may need, sorted from the highest level of prediction to the lowest.

    These predictions are NOT the predictions from the initial regression. Using a program called XLStat, I ran the regression analysis on all of the samples but one (actually two - both the home and away lines for that single game were removed). I then used the regression to predict the result of the one sample that was not included. In my mind, this removes any bias, because the event that was removed is entirely independent of the events used to predict it. I then proceeded to do the same for every one of the 72 events, and in this way obtained 144 predictions, each made without using the game being predicted. I do not know if this is good statistical practice, but it was the best I could come up with on a small sample (a code sketch of this leave-one-out procedure follows the results below). Here are the results:

    Obs    Predict    Fair Result    Pin Close    Actual (Not Fair) Odds
    Obs168 0.787 -1 195 0.338983
    Obs134 0.725 1.5483142 151 0.398406
    Obs83 0.645 0.8303939 -126 0.557522
    Obs174 0.628 1.2434783 120 0.454545
    Obs71 0.579 1.3416667 130 0.434783
    Obs32 0.547 0.8041958 -130 0.565217
    Obs79 0.527 0.911983 -115 0.534884
    Obs69 0.499 -1 -140 0.583333
    Obs109 0.495 2.2391716 218 0.314465
    Obs13 0.482 0.5761317 -180 0.642857
    Obs78 0.446 -1 108 0.480769
    Obs169 0.434 1.1062963 106 0.485437
    Obs128 0.421 -1 158 0.387597
    Obs136 0.42 -1 172 0.367647
    Obs4 0.412 1.174843 113 0.469484
    Obs34 0.387 -1 -170 0.62963
    Obs9 0.376 1.4104858 137 0.421941
    Obs74 0.374 0.7917008 -132 0.568966
    Obs186 0.361 -1 122 0.45045
    Obs73 0.345 0.9815846 -107 0.516908
    Obs70 0.345 -1 -153 0.604743
    Obs64 0.342 -1 103 0.492611
    Obs77 0.338 -1 108 0.480769
    Obs59 0.337 -1 132 0.431034
    Obs105 0.316 1.4104858 137 0.421941
    Obs55 0.312 0.75642 -138 0.579832
    Obs1 0.306 -1 -142 0.586777
    Obs135 0.304 1.0965116 105 0.487805
    Obs104 0.285 1.3416667 130 0.434783
    Obs130 0.281 1.2925532 125 0.444444
    Obs53 0.262 1.4104858 137 0.421941
    Obs35 0.257 0.6183093 -168 0.626866
    Obs7 0.256 0.6670836 -156 0.609375
    Obs60 0.255 -1 -151 0.601594
    Obs163 0.249 1.4695257 143 0.411523
    Obs57 0.246 0.6108597 -170 0.62963
    Obs91 0.244 0.7620824 -137 0.578059
    Obs124 0.235 1.5089105 147 0.404858
    Obs147 0.227 0.7345799 -142 0.586777
    Obs98 0.224 -1 186 0.34965
    Obs156 0.216 -1 130 0.434783
    Obs68 0.215 0.8959908 -117 0.539171
    Obs95 0.213 1.5483142 151 0.398406
    Obs127 0.209 1.637037 160 0.384615
    Obs100 0.201 -1 146 0.406504
    Obs137 0.192 -1 125 0.444444
    Obs149 0.185 -1 105 0.487805
    Obs181 0.183 1.3023729 126 0.442478
    Obs131 0.182 -1 250 0.285714
    Obs99 0.17 1.1062963 106 0.485437
    Obs138 0.168 2.0821118 202 0.331126
    Obs84 0.165 -1 -134 0.57265
    Obs67 0.164 -1 -160 0.615385
    Obs42 0.162 -1 -115 0.534884
    Obs90 0.161 0.4628331 -230 0.69697
    Obs58 0.159 0.7620824 -137 0.578059
    Obs61 0.143 -1 -105 0.512195
    Obs180 0.134 1.331841 129 0.436681
    Obs108 0.127 -1 175 0.363636
    Obs185 0.109 1.1258716 108 0.480769
    Obs8 0.1 0.7917008 -132 0.568966
    Obs40 0.096 0.7855963 -133 0.570815
    Obs43 0.096 0.5696509 -182 0.64539
    Obs14 0.096 0.8372093 -125 0.555556
    Obs103 0.093 1.4104858 137 0.421941
    Obs126 0.092 1.3613223 132 0.431034
    Obs65 0.088 0.8583359 -122 0.54955
    Obs86 0.085 0.7040654 -148 0.596774
    Obs85 0.085 -1 -174 0.635036
    Obs132 0.068 -1 125 0.444444
    Obs107 0.06 -1 115 0.465116
    Obs82 0.044 -1 -155 0.607843
    Obs145 0.043 -1 110 0.47619
    Obs16 0.042 -1 -238 0.704142
    Obs148 0.036 -1 128 0.438596
    Obs150 0.024 -1 160 0.384615
    Obs45 0.009 -1 -222 0.689441
    Obs11 0.005 -1 -140 0.583333
    Obs10 0.001 -1 -147 0.595142
    Obs63 -0.002 0.7453416 -140 0.583333
    Obs178 -0.007 1.6764964 164 0.378788
    Obs184 -0.02 -1 127 0.440529
    Obs38 -0.033 0.3915344 -270 0.72973
    Obs5 -0.038 0.5193835 -206 0.673203
    Obs170 -0.06 0.8882008 -118 0.541284
    Obs157 -0.064 0.9285496 -113 0.530516
    Obs36 -0.066 -1 -139 0.58159
    Obs183 -0.067 -1 210 0.322581
    Obs6 -0.067 -1 -116 0.537037
    Obs96 -0.069 0.9724972 -108 0.519231
    Obs3 -0.076 -1 -102 0.50495
    Obs33 -0.084 -1 -142 0.586777
    Obs39 -0.098 0.7736626 -135 0.574468
    Obs179 -0.111 -1 138 0.420168
    Obs177 -0.114 1.282735 124 0.446429
    Obs165 -0.134 -1 169 0.371747
    Obs160 -0.16 1.5384615 150 0.4
    Obs97 -0.16 -1 -123 0.55157
    Obs161 -0.17 -1 107 0.483092
    Obs166 -0.175 -1 -103 0.507389
    Obs173 -0.177 0.65 -160 0.615385
    Obs94 -0.19 1.3613223 132 0.431034
    Obs125 -0.201 -1 120 0.454545
    Obs153 -0.202 1.4498406 141 0.414938
    Obs93 -0.209 0.7917008 -132 0.568966
    Obs15 -0.228 0.5601966 -185 0.649123
    Obs72 -0.229 0.5794272 -179 0.641577
    Obs129 -0.23 1.331841 129 0.436681
    Obs162 -0.235 1.3416667 130 0.434783
    Obs31 -0.243 -1 -157 0.610895
    Obs155 -0.243 1 -105 0.512195
    Obs62 -0.245 -1 -105 0.512195
    Obs154 -0.246 1 -105 0.512195
    Obs87 -0.256 -1 -139 0.58159
    Obs152 -0.259 0.7345799 -142 0.586777
    Obs175 -0.261 1.4892157 145 0.408163
    Obs151 -0.262 -1 127 0.440529
    Obs88 -0.268 -1 -136 0.576271
    Obs159 -0.271 0.903917 -116 0.537037
    Obs101 -0.273 -1 122 0.45045
    Obs102 -0.276 -1 -147 0.595142
    Obs56 -0.277 0.911983 -115 0.534884
    Obs37 -0.28 -1 -135 0.574468
    Obs52 -0.316 0.8730159 -120 0.545455
    Obs92 -0.321 -1 -118 0.541284
    Obs106 -0.321 -1 170 0.37037
    Obs80 -0.325 -1 150 0.4
    Obs76 -0.328 -1 -116 0.537037
    Obs2 -0.337 -1 -161 0.616858
    Obs172 -0.353 -1 105 0.487805
    Obs133 -0.371 -1 123 0.44843
    Obs146 -0.389 -1 -147 0.595142
    Obs167 -0.395 -1 122 0.45045
    Obs75 -0.412 0.4966496 -215 0.68254
    Obs158 -0.42 -1 112 0.471698
    Obs54 -0.436 -1 132 0.431034
    Obs66 -0.439 -1 106 0.485437
    Obs176 -0.443 -1 116 0.462963
    Obs12 -0.452 -1 -147 0.595142
    Obs44 -0.484 0.7736626 -135 0.574468
    Obs41 -0.543 -1 -161 0.616858
    Obs81 -0.617 -1 -130 0.565217
    Obs164 -0.755 -1 -140 0.583333
    Obs171 -1.165 0.8882008 -118 0.541284


    Regarding the above results, it appears to me that there is a good degree of correlation. BTW, I have noticed that the results did not seem to be randomly distributed. For example, having 9 losses in a row and then another 11 losses in a row later on seems highly improbable in a list as short as this. I noticed as well that if I group the prediction column into weighted quartiles (I sum the column from top to bottom until I reach 1/4 of the total value of the column, which gives me my top quartile, and then go on to the next 25%, etc.), it seems to predict the exact spot where the results start to change dramatically. I even did this for 1/8s and it also worked almost perfectly. In fact, the 1/8 average results (from the top down) are:
    81.14%
    20.89%
    23.83%
    16.18%
    00.62%
    12.03%
    -88.48%
    -47.68%

    I do not know if this is a coincidence.
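    For reference, here is the leave-one-game-out refitting described above as a minimal sketch, assuming the data sit in a pandas DataFrame with the predictors AH and N, the dependent variable RESULTS, and a game identifier column (game_id is a hypothetical name):

import pandas as pd
import statsmodels.api as sm

def leave_one_game_out(df: pd.DataFrame) -> pd.Series:
    """Refit the two-variable regression once per game, holding out both the home
    and away rows of that game, and predict the held-out rows out of sample."""
    preds = pd.Series(index=df.index, dtype=float)
    for gid in df["game_id"].unique():
        train, test = df[df["game_id"] != gid], df[df["game_id"] == gid]
        X_train = sm.add_constant(train[["AH", "N"]])
        X_test = sm.add_constant(test[["AH", "N"]], has_constant="add")
        fit = sm.OLS(train["RESULTS"], X_train).fit()
        preds.loc[test.index] = fit.predict(X_test)
    return preds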

    If you want me to email you an excel spreadsheet with the data so you can work with it easier just let me know.

    Thanks for looking at this and for all your advice in the forum.

    VideoReview

    PS If you think this discussion is better done as a PM, let me know how and maybe delete this irrelevant (to the thread) post. Thanks.
    • RickySteve
      Restricted User
      • 01-31-06
      • 3415

      #3
      Miniscule sample.

      Back-fitting.

      Trying to beat the market with widely available information.
      • Ganchrow
        SBR Hall of Famer
        • 08-28-05
        • 5011

        #4
        Originally posted by VideoReview

        -snipped message-
        Your question actually has little to do with Kelly per se. Given stated payout odds, the mapping of probability to Kelly stake will be injective (i.e., one-to-one) for all positive-expectation probabilities. As such, once you determine a confidence interval for your forecast probability, converting it to a confidence interval for Kelly is trivial.

        I'm a little unclear as to why you're using R² as an estimator of probability. Loosely speaking, the R² of a model corresponds to the percent of the variability of your data set that's explained by that model. Unless I've misunderstood the nature of your regression, that's going to be very different from the win probability you're attempting to estimate.

        Typically when trying to estimate a probability (which is obviously only defined on the interval [0,1]) using regression analysis, one uses the logarithm of the inverse of "fair" payout odds (i.e., "fair" decimal odds - 1, or b in your Kelly equation given an edge of 0) as the dependent variable, which is defined across the entire set of real numbers. This is known as a "logistic regression".

        It's also not strictly correct to use ±0.98/sqrt(n) as your interval. While this does correspond to a 95% confidence interval, it would only really be strictly true were your data set drawn from a single binomial distribution with p = 50% (0.98 = 1.96*sqrt(50%*(1-50%))). In the context of your particular problem the confidence interval would be better expressed using the standard error of the regression.

        Specific mechanics aside I suspect I'd probably tend to agree with RickySteve in his analysis. That said, why don't you e-mail me your spreadsheet along with a description of each of the columns and we can take it from there.
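        One way to read that suggestion in code, assuming a pandas DataFrame df holding the predictors AH and N plus a no-vig closing probability column p_fair (a hypothetical column name): regress the log-odds of the fair probability on the predictors, so the dependent variable spans the whole real line, then map the fitted values back into (0, 1). A true logistic regression on the binary win/loss outcome is the other common route.

import numpy as np
import pandas as pd
import statsmodels.api as sm

def fit_on_log_odds(df: pd.DataFrame):
    """Regress logit(p_fair) = log(p / (1 - p)) on AH and N, then back-transform."""
    y = np.log(df["p_fair"] / (1.0 - df["p_fair"]))   # log-odds: defined on all reals
    X = sm.add_constant(df[["AH", "N"]])
    fit = sm.OLS(y, X).fit()
    p_hat = 1.0 / (1.0 + np.exp(-fit.predict(X)))     # back to probabilities in (0, 1)
    return fit, p_hat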
        • roasthawg
          SBR MVP
          • 11-09-07
          • 2990

          #5
          Originally posted by RickySteve
          Miniscule sample.

          Back-fitting.

          Trying to beat the market with widely available information.
          yeah but unless you have some sort of inside info then you'll always be trying to beat the market with widely available information. and as far as back-fitting, couldn't any analysis of past results be accused of the same thing?
          • Data
            SBR MVP
            • 11-27-07
            • 2236

            #6
            Originally posted by VideoReview
            Regression of variable RESULTS:

            Goodness of fit statistics:

            Observations 144.000
            Sum of weights 144.000
            DF 141.000
            R² 0.088
            Adjusted R² 0.075

            My first question, which is really just curiosity, is: do you think the quality of these results justifies continuing to develop this model?
            This seems like a first step of a long journey.


            Regarding the above results, it appears to me that there is a good degree of correlation.
            You do not need to guess here. The correlation between your model's predictions and the variable RESULTS is 0.297, as it is just the positive square root of R-squared. While 0.297 is certainly noticeable and can be used for forecasting, I would venture a guess that the linesmakers' model produces much better results. Therefore, while you can make predictions that are much better than random, you are currently unable to make predictions better than the linesmakers do, and therefore cannot beat the line with your model in its current state.
            • VideoReview
              SBR High Roller
              • 12-14-07
              • 107

              #7
              How do I email you Ganchrow?
              • Ganchrow
                SBR Hall of Famer
                • 08-28-05
                • 5011

                #8
                E-mail address is in my profile.
                • Ganchrow
                  SBR Hall of Famer
                  • 08-28-05
                  • 5011

                  #9
                  Originally posted by Data
                  You do not need to guess here. The correlation between your model's predictions and the variable RESULTS is 0.297, as it is just the positive square root of R-squared. While 0.297 is certainly noticeable and can be used for forecasting, I would venture a guess that the linesmakers' model produces much better results. Therefore, while you can make predictions that are much better than random, you are currently unable to make predictions better than the linesmakers do, and therefore cannot beat the line with your model in its current state.
                  I wouldn't even go that far at this point, because as it stands the model most likely isn't properly specified, and so I suspect the R² might be less meaningful than it appears.

                  I think he's probably going to need to redo this as a logistic regression and then take it from there.
                  • VideoReview
                    SBR High Roller
                    • 12-14-07
                    • 107

                    #10
                    Originally posted by Ganchrow
                    Your question actually has little to do with Kelly per se. Given stated payout odds, the mapping of probability to Kelly stake will be injective (i.e., one-to-one) for all positive-expectation probabilities. As such, once you determine a confidence interval for your forecast probability, converting it to a confidence interval for Kelly is trivial.

                    I am having trouble understanding what you mean. Perhaps you could explain using the following simple example.

                    Assume I have the following random sample (That is, it was hypothesized beforehand without looking at the data, and these are the results):

                    2500 games win at +140
                    2500 games win at +120
                    5000 games lose at -100

                    My win probability is .50, my loss probability is .50, and my average odds payout is 1.30. My ROI is .15, for interest's sake.

                    Therefore, Kelly = ((.50*1.30)-.50)/1.3 = .1153846

                    I have 2 questions.

                    How do I factor in the 10,000 event sample size to calculate for various confidence levels (e.g. 95%)?

                    Is a logistic regression of the inverse of fair payout odds required when I am not trying to determine a probability between 0 and 1 but a return (or maybe I shouldn't be thinking along those lines at all)?

                    If I had a similar ROI of .15 but with a 50/50 outcome such as:
                    5000 games win at +130
                    5000 games lose at -100
                    The way I would calculate Kelly at 95% confidence would be:
                    ((.50 - .98/sqrt(10000)) * 1.3 - (.50 + .98/sqrt(10000))) / 1.3 = .098046

                    Is the above calculation correct?

                    VideoReview
                    • Ganchrow
                      SBR Hall of Famer
                      • 08-28-05
                      • 5011

                      #11
                      Originally posted by VideoReview
                      Assume I have the following random sample (That is, it was hypothesized beforehand without looking at the data, and these are the results):

                      2500 games win at +140
                      2500 games win at +120
                      5000 games lose at -100

                      My win probability is .50, my loss probability is .50, and my average odds payout is 1.30. My ROI is .15, for interest's sake.

                      Therefore, Kelly = ((.50*1.30)-.50)/1.3 = .1153846
                      If you don't mind making a bunch of simplifying assumptions, here's a simple way of approximating this:

                      If we take the 15% return as an unbiased indicator of true population edge (which we assume doesn't vary with payout odds) and further assume that half the sample was at +120 and the other half at +140, then the sample variance for unit-risk bets for each of the two odds classes would be (5,000 * (1+edge) * (payout_odds - edge)).

                      Hence the total std. dev. of our edge estimate (assuming betting to win equal quantities) for unit-win bets would then be: SQRT(5000*1.15*((1.4-0.15)/1.4^2+(1.2-0.15)/1.2^2))/(5000/1.4+5000/1.2) ≈ 1.146%.

                      Appealing to the central limit theorem and assuming edge doesn't vary with payout odds, your 95% confidence interval for edge would then be about 15% ± 2.246% (if you wanted to be a bit more accurate you'd probably want to use either a Weibull or lognormal distribution).

                      So at +120 your full-Kelly 95% confidence interval would be about (10.629%, 14.371%).

                      And at +140 your full-Kelly 95% confidence interval would be about (9.110%, 12.318%).

                      Originally posted by VideoReview
                      If I had a similar ROI of .15 but with a 50/50 outcome such as:
                      5000 games win at +130
                      5000 games lose at -100
                      The way I would calculate Kelly at 95% confidence would be:
                      ((.50 - .98/sqrt(10000)) * 1.3 - (.50 + .98/sqrt(10000))) / 1.3 = .098046

                      Is the above calculation correct?
                      As long as you're comfortable appealing to the Central Limit Theorem, then yes.
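                      A quick sketch of that arithmetic, under the same simplifying assumptions (a 15% edge that doesn't vary with odds, 5,000 bets at each price, all sized to win one unit):

from math import sqrt

E = 0.15                                 # assumed true edge, independent of payout odds
classes = [(5000, 1.4), (5000, 1.2)]     # (bets in class, decimal payout odds)

# per-bet variance of a unit-risk bet at odds w is (1+E)*(w-E); scale by the
# (1/w)^2 stake of a unit-win bet, sum, then divide by the total amount risked
sigma = sqrt(sum(n * (1 + E) * (w - E) / w**2 for n, w in classes)) \
        / sum(n / w for n, w in classes)                     # ≈ 1.146%

half = 1.96 * sigma                                          # ≈ 2.246%
for _, w in classes:
    print(f"+{w*100:.0f}: full-Kelly 95% CI ≈ ({(E - half)/w:.3%}, {(E + half)/w:.3%})")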
                      • VideoReview
                        SBR High Roller
                        • 12-14-07
                        • 107

                        #12
                        Originally posted by Ganchrow
                        So at +120 your full-Kelly 95% confidence interval would be about (10.629%, 14.371%).

                        And at +140 your full-Kelly 95% confidence interval would be about (9.110%, 12.318%).

                        As long as you're comfortable appealing to the Central Limit Theorem, then yes.
                        I assume the 10.629% and 9.110% are the full-Kelly for the different odds levels. What do the 14.371% and 12.318% mean?

                        Also, could you write out the full-Kelly equation for each of these 2 odds levels so I can see how you get from .15 +/- .01146 to the 10.629% and 9.110% respectively?
                        • Ganchrow
                          SBR Hall of Famer
                          • 08-28-05
                          • 5011

                          #13
                          Originally posted by VideoReview
                          I assume the 10.629% and 9.110% are the full-Kelly for the different odds levels. What do the 14.371% and 12.318% mean?

                          Also, could you write out the full-Kelly equation for each of these 2 odds levels so I can see how you get from .15 +/- .01146 to the 10.629% and 9.110% respectively?
                          Those are 95% confidence intervals. The two numbers refer to the lower and upper bounds respectively.

                          The confidence interval for the edge is 15% ± 2.246% (as 2.246% ≈ 1.96*1.146%). So the lower interval bound for payout odds of w would just be (15% - 2.246%)/w.
                          • VideoReview
                            SBR High Roller
                            • 12-14-07
                            • 107

                            #14
                            Originally posted by Ganchrow
                            Hence the total sample std. dev (assuming betting to win equal quantities) would then be the sqrt of the sum of the two variances weighted by the inverse of payout odds sqrt(5000 * ( 1.15/1.4*(1.4-0.15) + 1.15/1.2*(1.2-0.15) ) / (1/1.2+1/1.4))/5000 ≈ 1.621%.
                            If I understand correctly, if all of the initial assumptions (15% true edge, etc.) are true and I have the following distribution:
                            2000 wins at +120
                            2000 losses at -100 (odds were +120)
                            3000 wins at +140
                            3000 losses at -100 (odds were +140)
                            The total sample standard deviation (assuming to win equal quantities as you suggested) would be:
                            SQRT((6000*1.15/1.4*(1.4-0.15)+4000*1.15/1.2*(1.2-0.15) )/(1/1.2+1/1.4))/5000 = 1.6225%

                            From what I can figure, the "/5000" at the end of the equation represents the total number of bets won which is being used because of the weighting inside of the equation.

                            Assuming my proposed formula changes for the new distribution are accurate then I am able to calculate the sample standard deviation with a variety of payout odds (again, assuming I have an unbiased indicator of true population edge) as follows:

                            sqrt ( ((1+edge) * (payout_odds_1 - edge) + (1+edge) * (payout_odds_2 - edge) + (1+edge) * (payout_odds_3 - edge) + ....) / (1/payout_odds_1 + 1/payout_odds_2 + 1/payout_odds_3 + ....) ) / total_wins

                            Although I am starting to see the connection, I am fairly certain the above is incorrect because I have not taken into account the losses since your assumption at the beginning was that the losses came equally from each of the +120 and +140 odds groups.

                            This, of course, is all leading to what to do with a population of various odds and their results and arriving at a Kelly number for them (assuming it is a good random sample of the population and the calculated edge is unbiased).

                            For example, if I had the following small random sample:

                            +140 wins
                            +120 wins
                            +110 wins
                            +100 wins
                            -105 wins
                            -200 wins
                            +130 losses
                            +125 losses
                            +180 losses
                            -150 losses

                            My ROI (without weighting) is (+615.2381-400)/1000= +21.5238%

                            Now, I realize this is a very small sample and the Standard Deviation will be relatively large, but how would I calculate it?
                            • VideoReview
                              SBR High Roller
                              • 12-14-07
                              • 107

                              #15
                              Originally posted by Ganchrow
                              The confidence interval for the edge is 15% ± 3.177% (as 3.177% ≈ 1.96*1.621%). So the lower interval bound for payout odds of w would just be (15% - 3.177%)/w.
                              "/w". Wow. I have been trying to figure that out for almost a year. Thank you very much!
                              • Ganchrow
                                SBR Hall of Famer
                                • 08-28-05
                                • 5011

                                #16
                                Originally posted by VideoReview
                                If I understand correctly, if all of the initial assumptions (15% true edge, etc.) are true and I have the following distribution:
                                2000 wins at +120
                                2000 losses at -100 (odds were +120)
                                3000 wins at +140
                                3000 losses at -100 (odds were +140)
                                The total sample standard deviation (assuming to win equal quantities as you suggested) would be:
                                SQRT((6000*1.15/1.4*(1.4-0.15)+4000*1.15/1.2*(1.2-0.15) )/(1/1.2+1/1.4))/5000 = 1.6225%
                                If you were to make the a priori assumption that strategy edge were expected to be 15% irrespective of payout odds, then the standard deviation of that estimate would be:
                                sqrt(1.15*((1.2-0.15)*4000/1.2^2+(1.4-0.15)*6000/1.4^2))/(4000/1.2+6000/1.4)≈ 1.156%.

                                 (This means that the observed return of 15.625% was about 0.625%/1.156% ≈ 0.5408 standard deviations better than expected.)

                                The 95% confidence interval for Kelly, assuming payout odds of w would then be ( 15.00% ± 2.265% ) / w.

                                 If we assume the observed return of 15.63%, the standard deviation remains 1.156% (the change in standard deviation is outside our level of decimal precision), so the 95% confidence interval would be ( 15.63% ± 2.265% ) / w.
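                                 The same arithmetic as a quick check in code for this 4,000/6,000 split (a sketch, under the same a priori 15%-edge assumption):

from math import sqrt

E = 0.15
classes = [(4000, 1.2), (6000, 1.4)]     # (bets in class, decimal payout odds), sized to win 1 unit each

sigma = sqrt((1 + E) * sum((w - E) * n / w**2 for n, w in classes)) \
        / sum(n / w for n, w in classes)                       # ≈ 1.156%

# observed return: 2000 wins / 2000 losses at +120, 3000 wins / 3000 losses at +140
profit = 2000 + 3000 - 2000 / 1.2 - 3000 / 1.4                 # winners pay 1 unit; losers cost their 1/w stake
risked = sum(n / w for n, w in classes)
observed = profit / risked                                     # ≈ 15.625%
print(f"sigma={sigma:.3%}, observed={observed:.3%}, z={(observed - E) / sigma:.2f}")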
                                • Ganchrow
                                  SBR Hall of Famer
                                  • 08-28-05
                                  • 5011

                                  #17
                                  OK. So here's the corrected general formula for % standard deviation assuming a single uniform (known) population edge:

                                   Let w_i = payout odds (win amount) on the i-th bet class
                                   Let n_i = number of instances of the i-th bet class in the sample
                                   Let E = expected strategy-wide edge

                                   σ = sqrt( (1+E) * Σ( (w_i - E) * n_i / w_i² ) ) / Σ( n_i / w_i )


                                   And the 95% confidence interval for full-Kelly given payout odds of w would then be ( E ± 1.96*σ ) / w.
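                                   And the same formula as a small general-purpose Python helper (a sketch; the function and argument names are mine):

from math import sqrt

def edge_sigma(E, classes):
    """Std. dev. of the edge estimate for a sample of bets sized to win one unit
    each, assuming a single known strategy-wide edge E.
    `classes` is a list of (n_i, w_i) pairs: count and payout odds per bet class."""
    return sqrt((1 + E) * sum((w - E) * n / w**2 for n, w in classes)) \
           / sum(n / w for n, w in classes)

def full_kelly_ci(E, classes, w, z=1.96):
    """Confidence interval (95% by default) for the full-Kelly stake at payout odds w."""
    half = z * edge_sigma(E, classes)
    return (E - half) / w, (E + half) / w

# e.g. the single-price case from post #10: 10,000 bets at +130 with a 15% edge
print(full_kelly_ci(0.15, [(10000, 1.3)], w=1.3))   # ≈ (0.098, 0.133), matching the ~.098 lower bound above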
                                  • mofome
                                    SBR Posting Legend
                                    • 12-19-07
                                    • 13003

                                    #18
                                    allow me to chime in for a moment. holy shtt.




                                    awesome as always ganch. you're an amazing individual my friend. if my iq was even 1/5th of yours i would tie my shoes and probably be an asshole. you truly encompass the phrase, 'a gentleman and a scholar'. hope your weekend is a good one.

                                    • Ganchrow
                                      SBR Hall of Famer
                                      • 08-28-05
                                      • 5011

                                      #19
                                      Originally posted by mofome
                                      allow me to chime in for a moment. holy shtt.

                                      awesome as always ganch. you're an amazing individual my friend. if my iq was even 1/5th of yours i would tie my shoes and probably be an asshole. you truly encompass the phrase, 'a gentleman and a scholar'. hope your weekend is a good one.
                                      You must have missed the silly error in my standard deviation formulation starting with post #11.

                                      Anyway, it should be correct now.
                                      • mofome
                                        SBR Posting Legend
                                        • 12-19-07
                                        • 13003

                                        #20
                                        Originally posted by Ganchrow
                                        You must have missed the silly error in my standard deviation formulation starting with post #11.

                                        Anyway, it should be correct now.


                                        i passed on the chance to embarrass you. i would say you owe me one, but now that i think about it, we're even.

                                        • Ganchrow
                                          SBR Hall of Famer
                                          • 08-28-05
                                          • 5011

                                          #21
                                          Originally posted by mofome
                                          i passed on the chance to embarrass you. i would say you owe me one, but now that i think about it, we're even.
                                           Thanks, Bro. I appreciate that.

                                          What say I just buy you a new headband and we call it even?
                                          • mofome
                                            SBR Posting Legend
                                            • 12-19-07
                                            • 13003

                                            #22
                                            Originally posted by Ganchrow
                                             Thanks, Bro. I appreciate that.

                                            What say I just buy you a new headband and we call it even?


                                            • VideoReview
                                              SBR High Roller
                                              • 12-14-07
                                              • 107

                                              #23
                                              Originally posted by Ganchrow
                                              OK. So here's the corrected general formula for % standard deviation assuming a single uniform (known) population edge:

                                               Let w_i = payout odds (win amount) on the i-th bet class
                                               Let n_i = number of instances of the i-th bet class in the sample
                                               Let E = expected strategy-wide edge

                                               σ = sqrt( (1+E) * Σ( (w_i - E) * n_i / w_i² ) ) / Σ( n_i / w_i )


                                               And the 95% confidence interval for full-Kelly given payout odds of w would then be ( E ± 1.96*σ ) / w.
                                               I hope I am not embarrassing myself too much by showing how little I know about formulas, but I think that in 25+ years of gambling, next only to the Kelly formula, the above equation is the most beautiful thing I have ever seen.

                                              Ganchrow, thank you for taking the time to explain (and re-explain) the answers to what I was looking for.