## View Poll Results: Which bet is better value?

Voters: 6
• Win: 4 (66.67%)
• Top 3: 2 (33.33%)
• Top 5: 0 (0%)
1. ## Win v Place odds value - Math Question

If I rate a runner in a field of 30 as a +1600 chance, and have these options to bet on them, how do I calculate which one is the Better Value?

Win only: +2000
Top 3: +520
Top 5: +310
175 pts

3-QUESTION
SBR TRIVIA WINNER 08/04/2022

SBR Bash
Punta Cana
Attendee 2/4/2017

2. It all would come down to the relative strengths you assigned to the other runners.

If you judge them all to be equally matched each with the same probability of losing relative to the "good runner" and each with the same probability of winning/losing to any other "bad runner" then it's easy:

If the good runner's fair odds are +1600, then that would imply a $1/17 \approx 5.88235\%$ probability of his winning the race.
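As a rough illustration of the conversion used here, the sketch below turns American odds into decimal odds and a break-even probability. The function names are made up for this example and do not come from the thread.

```python
def american_to_decimal(american):
    """Convert American (moneyline) odds to decimal odds (total return per unit staked)."""
    if american > 0:
        return 1 + american / 100
    return 1 + 100 / abs(american)


def fair_probability(american):
    """Win probability at which a bet at these odds exactly breaks even."""
    return 1 / american_to_decimal(american)


# The poster's own rating of the runner: +1600 is 17.0 in decimal terms.
print(f"rated chance: {fair_probability(1600):.5%}")

# Decimal odds for the three prices on offer.
for label, odds in [("Win", 2000), ("Top 3", 520), ("Top 5", 310)]:
    print(label, round(american_to_decimal(odds), 4))
```

The decimal figures (21, 6.2 and 4.1) are the multipliers that appear in the edge calculations further down this post.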

Along with the above assumptions, this further implies a $\sqrt[29]{5.88235\%} \approx 90.69236\%$ probability of the "good runner" finishing higher than any given "bad runner".

Hence his probability of finishing in exactly 2nd place would be: $(1 - 5.88235\%) \times 90.69236\%^{28} \approx 6.10452\%$.

Exactly 3rd place: $(1 - 5.88235\% - 6.10452\%) \times 90.69236\%^{27} \approx 6.29444\%$.

Exactly 4th place: $(1 - 5.88235\% - 6.10452\% - 6.29444\%) \times 90.69236\%^{26} \approx 6.44407\%$.

Exactly 5th place: $(1 - 5.88235\% - 6.10452\% - 6.29444\% - 6.44407\%) \times 90.69236\%^{25} \approx 6.54511\%$.

So the probability of winning the +2000 bet is 5.88235%, for an edge of 21 * 5.88235% - 1 ≈ 23.5294%.

The probability of winning the +520 bet is 5.88235% + 6.10452% + 6.29444% ≈ 18.28131%, for an edge of 6.2 * 18.28131% - 1 ≈ 13.34413%.

And the probability of winning the +310 bet is 18.28131% + 6.44407% + 6.54511% ≈ 31.27049%, for an edge of 4.1 * 31.27049% - 1 ≈ 28.20901%.

Figuring out appropriate corresponding Kelly stakes is left as an exercise for the interested reader.
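For anyone attempting that exercise, a minimal sketch follows. It assumes each bet is sized on its own with the standard single-bet Kelly formula, f = (p*d - 1) / (d - 1), where d is the decimal odds; sizing the three bets simultaneously would need a joint optimization that this does not attempt. The probabilities are the ones quoted in this post, and `kelly_fraction` is an illustrative name.

```python
def kelly_fraction(win_prob, decimal_odds):
    """Single-bet Kelly stake as a fraction of bankroll (0 when there is no edge)."""
    edge = win_prob * decimal_odds - 1
    return max(0.0, edge / (decimal_odds - 1))


# (probability of the bet winning, decimal odds), as quoted in the post
bets = {
    "Win (+2000)": (0.0588235, 21.0),
    "Top 3 (+520)": (0.1828131, 6.2),
    "Top 5 (+310)": (0.3127049, 4.1),
}

for label, (prob, odds) in bets.items():
    print(f"{label}: {kelly_fraction(prob, odds):.2%} of bankroll")
```

Run on these inputs it gives roughly 1.18%, 2.57% and 9.10% of bankroll respectively.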

3. 1.18% of BR on the Win, or 2.57% on Top3, or 9.1% on the Top5 bet.

Wow Ganchrow, thank you. And double thank you for explaining it so well.

Intuitively, I had expected the order of value to come out: Win, then Top 3, then Top 5. Being so wrong with that guess reinforces how important this is for me.

Hope you don't mind a couple of follow up questions.

1) Is this logic correct? Runner A has a 92.94% implied chance of beating any other, Runner B has a 91.73% chance. Meaning A has 1.21% more chance in a 1 on 1 matchup than B? Therefore ((1-0.0121)/(0.0121+1))+1 = \$1.98 is the implied fair odds on the underdog? (or should I be looking at the ratio of their probabilities?)

2) I'd very much like to know how to calculate this out when I don't "judge them all to be equally matched each with the same probability of losing" too, if you have the time and patience to set that out as well.

4. What kind of "runner"?

Joe.

5. Originally Posted by u21c3f6
What kind of "runner"?

Joe.
For me, it's Car drivers. Formula 1 and NASCAR. Should suit stuff like overall World Cup betting too of course.

Those figures were for Montoya in the Pocono race. He finished 8th.

6. Originally Posted by Optional
For me, it's Car drivers. Formula 1 and NASCAR. Should suit stuff like overall World Cup betting too of course.

Those figures were for Montoya in the Pocono race. He finished 8th.

I can't answer how it may work in car racing or World Cup but here is a thought as it relates to horse racing which may or may not apply to your problem.

There have been books about looking for inefficiencies in the place and show pools in horse racing. Probably most notably by Ziemba and Hausch. They also came up with a formula to try to determine probabilities of finishing in the money from the win odds. They did a decent job and one could find inefficiencies though they never approached the supposed edge as calculated by the formula. This happened because there are horses that will win when they are ready to win and not necessarily finish in the money and conversely, horses that will finish in the money but for whatever reason just can never seem to win.

Are there race car drivers like that where if they are close they win but very rarely finish second or third or in the top 5? This may need to be a consideration when trying to determine probabilities. It may not be linear.

Joe.

7. Very good observation. I doubt the effect will be anywhere near as marked in NASCAR, because everyone races an entire season, aims to peak at the same time and goes out to spell at the same time.

I'm not sure the sample size will be enough with only 36 races per year, but I've just added it to my todo list to check out anyway. Sounds like something that might at least help improve my handicapping, if I can spot any individuals that trend that way.

8. Originally Posted by Ganchrow
If the good runner's fair odds are +1600, then that would imply a $1/17 \approx 5.88235\%$ probability of his winning the race.

Along with the above assumptions, this further implies a $\sqrt[29]{5.88235\%} \approx 90.69236\%$ probability of the "good runner" finishing higher than any given "bad runner".
My apologies but I completely dropped the ball on this one.

Let me start over.

If the "good runner" has a 5.88235% probability of winning the race, then each of the other runners has a (1-5.88235%)/29 ≈ 3.24544% probability of winning the race.

The good runner's probability of placing 2nd conditioned on his not placing 1st would then be 5.88235% / (1 - 3.24544%) ≈ 6.0797%, which is simply his winning probability adjusted for the reduced pool of runners.

To get his absolute probability of finishing 2nd we simply multiply by his own probability of not finishing 1st.

In other words:

P(2nd place finish) = P(2nd place finish | not 1st place finish) * P(Not 1st place finish)
P(2nd place finish) = 6.0797% * (1 - 5.88235%) ≈ 5.72204%

So in general the good runner's probability of placing in position k ≤ 30, given that he did not finish in positions 1 ... k-1, would be:

P(kth place finish | not 1st-(k-1)th place finish) = 5.88235% / (1 - (k-1) * 3.24544%)

And the absolute probability of a kth place finish would then be:

P(kth place finish) = P(kth place finish | not 1st-(k-1)th place finish) * P(not 1st-(k-1)th place finish)
= 5.88235% / (1 - (k-1) * 3.24544%) * (1 - $\sum_{i=2}^{k}$ P((i-1)th place finish))

So:

Edge(+2000 bet) = 5.88235% * 21 - 1 ≈ 23.5294%
Edge(+520 bet) = 17.16507% * 6.2 - 1 ≈ 6.42343%
Edge(+310 bet) = 27.79795% * 4.1 - 1 ≈ 13.97160%

Which yields full Kelly stakes of ~ 0.56555% on the win bet, 0% on the top 3 bet, and 3.94141% on the top 5 bet.
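The corrected place probabilities and edges can be checked numerically. This is a minimal sketch under the post's assumption that the other 29 runners are equally matched; the function names are illustrative, and it reproduces only the probabilities and edges, not the joint Kelly stakes.

```python
def good_runner_place_probs(p_win, n_runners, places):
    """P(good runner finishes exactly k-th) for k = 1..places, equal-strength rivals."""
    q = (1 - p_win) / (n_runners - 1)      # win probability of each other runner
    probs = []
    not_yet_placed = 1.0                   # P(good runner missed positions 1..k-1)
    for k in range(1, places + 1):
        conditional = p_win / (1 - (k - 1) * q)   # P(k-th | not in first k-1)
        p_k = conditional * not_yet_placed
        probs.append(p_k)
        not_yet_placed -= p_k
    return probs


def edge(prob, decimal_odds):
    """Expected profit per unit staked."""
    return prob * decimal_odds - 1


probs = good_runner_place_probs(1 / 17, 30, 5)
top3, top5 = sum(probs[:3]), sum(probs)
print(f"P(top 3) = {top3:.5%}, edge at 6.2 = {edge(top3, 6.2):.5%}")
print(f"P(top 5) = {top5:.5%}, edge at 4.1 = {edge(top5, 4.1):.5%}")
```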

I should also note that the probability of the good runner beating any other given runner would be:

$\frac{5.88235\%}{5.88235\% + 3.24544\%} \approx 64.44444\%$

Apologies once again. I just completely spaced.

Originally Posted by Optional
Is this logic correct? Runner A has a 92.94% implied chance of beating any other, Runner B has a 91.73% chance. Meaning A has 1.21% more chance in a 1 on 1 matchup than B? Therefore ((1-0.0121)/(0.0121+1))+1 = \$1.98 is the implied fair odds on the underdog? (or should I be looking at the ratio of their probabilities?)
Assuming regularity conditions hold, runner A's probability of beating runner B would be given by:

$\Pr(A/B) = \frac{\Pr(A/\text{Field}) \, (1 - \Pr(B/\text{Field}))}{\Pr(A/\text{Field}) \, (1 - \Pr(B/\text{Field})) + (1 - \Pr(A/\text{Field})) \, \Pr(B/\text{Field})}$

Or more succinctly put using the logit function (defined as lg(x) = log(x) - log(1-x)):

lg(Pr(A/B)) = lg(Pr(A/Field)) - lg(Pr(B/Field))

Either way, Pr(A/B) ≈ 54.27191%
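Both forms are easy to check against each other. A small sketch using the 92.94% and 91.73% figures from the question; the function names are illustrative only:

```python
import math


def logit(x):
    """lg(x) = log(x) - log(1 - x)."""
    return math.log(x) - math.log(1 - x)


def inv_logit(z):
    """Inverse of logit: maps a log-odds value back to a probability."""
    return 1 / (1 + math.exp(-z))


def head_to_head_logit(a_vs_field, b_vs_field):
    """Pr(A beats B) from the logit difference."""
    return inv_logit(logit(a_vs_field) - logit(b_vs_field))


def head_to_head_ratio(a_vs_field, b_vs_field):
    """Pr(A beats B) from the explicit fraction."""
    num = a_vs_field * (1 - b_vs_field)
    return num / (num + (1 - a_vs_field) * b_vs_field)


print(f"{head_to_head_logit(0.9294, 0.9173):.5%}")
print(f"{head_to_head_ratio(0.9294, 0.9173):.5%}")
```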

To get the probabilities of winning the entire race, we have:

Pr(A) = Pr(A/Field)*Pr(A/B)/(Pr(A/Field)+28*Pr(A/B)*(1-Pr(A/Field)))
Pr(A) ≈ 25.19185%

Pr(B) = Pr(B/Field)*(1-Pr(A/B))/(Pr(B/Field)+28*(1-Pr(A/B))*(1-Pr(B/Field)))
Pr(B) ≈ 21.22599%

Pr(Other) = (1-Pr(A)-Pr(B))/28
Pr(Other) ≈ 1.91365%
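The three formulas above can be transcribed directly. The sketch below assumes a 30-runner field (A, B and 28 equally matched others), which is what the factor of 28 implies; the function and variable names are illustrative.

```python
def race_win_probs(a_vs_field, b_vs_field, n_runners=30):
    """Return (Pr(A), Pr(B), Pr(each other runner)) using the formulas in the post."""
    others = n_runners - 2
    # Head-to-head probability of A beating B.
    num = a_vs_field * (1 - b_vs_field)
    a_vs_b = num / (num + (1 - a_vs_field) * b_vs_field)
    b_vs_a = 1 - a_vs_b

    pr_a = a_vs_field * a_vs_b / (a_vs_field + others * a_vs_b * (1 - a_vs_field))
    pr_b = b_vs_field * b_vs_a / (b_vs_field + others * b_vs_a * (1 - b_vs_field))
    pr_other = (1 - pr_a - pr_b) / others
    return pr_a, pr_b, pr_other


pr_a, pr_b, pr_other = race_win_probs(0.9294, 0.9173)
print(f"Pr(A) = {pr_a:.5%}, Pr(B) = {pr_b:.5%}, Pr(Other) = {pr_other:.5%}")
```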
Originally Posted by Optional
I'd very much like to know how to calculate this out when I don't "judge them all to be equally matched each with the same probability of losing" too, if you have the time and patience to set that out as well.
As you can probably tell, these kinds of combinatoric problems can get pretty messy pretty quickly. The best way to tackle it, provided you had the know-how, would just be to write a program to traverse the different combinations.

You just need to remember that at each conditional state all the probabilities need to sum to unity, and then keep applying both Bayes' theorem, to convert to absolute probabilities, and the logit function, to compare probabilities between contestants for whom no direct heads-up probability is given.

9. Thanks again Ganchrow. I had to read that lot several times, but have got it now.

3 more quick follow up questions though, if you don't mind.

1) A NASCAR field has 43 starters. When I do my handicap ratings, I generally end up with 5 to 10 drivers on zero rating. In my calculations I have treated it as a field with however many runners I've actually rated with some chance. The combined probability of the remainder of starters is likely well under 1%.

For best accuracy, should I ignore them completely as I am now? Or should I be calculating the above equations with the full 43 starters, including those with 0% rated prob?

2) I have intermediate experience with PHP, and I 'think' I can come up with the program you describe. (with a bit of research beyond my high school math education) But...

Should I be looking at trying it with a different language that might prove more suitable for this type of work in future?

3) How much utility would you give to the current figures that don't take into account the relative probability of each runner? Can I rely on them as being useful at all?

10. Do these figures look right to you Ganchrow?

With the original calcs the distribution looked a lot more like reality, and not evenly graduated. The third ranked runner's probability to finish 3rd was highest, and fell away either side for instance.

I've triple checked my work, so guess it's right, but just thought I'd ask.

(BAO = Beat Any Other)

[Image: table of each runner's BAO and Win through 5th place probabilities]

11. Originally Posted by Optional
Do these figures look right to you Ganchrow?

With the original calcs the distribution looked a lot more like reality, and not evenly graduated. The third ranked runner's probability to finish 3rd was highest, and fell away either side for instance.

I've triple checked my work, so guess it's right, but just thought I'd ask.

(BAO = Beat Any Other)

-image snipped-
I'm not sure I really understand the posted table.

To what exactly does the Pr(Beat Any Other) refer? I'd assume that the probability of a given driver "beating any other" would be dependent on the driver whom he was facing. For example, shouldn't Pr(#1 beating #2) < Pr(#1 beating #30)? Or am I missing something?

Also, does the table in any way account for the additional drivers to which you had alluded (i.e., numbers 37 through 43)?

The runners are sorted by a handicap rating I gave them. (I probably should have included that column too, to help make better sense of it)

BAO refers to the calculation you gave above for "the probability of the good runner beating any other given runner". It wasn't necessary for me to include it there, I was really just asking about the distribution of the Win thru 5th columns.

The table includes the 36 of 43 runners I had rated >0 in the race only. The others are not factored into any calculations there.

13. OK I see what you're saying.

"The probability of the good runner beating any other given runner" in the first example was straightforward in that we had assumed that every runner was of equal ability.

In the second example, that self-same probability referred to A's probability of beating any runner OTHER THAN B, and similarly to B's probability of beating any runner OTHER THAN A.

What you're doing is looking at each runner in isolation and given, say, a win probability, figuring out the remaining 2nd-5th place probabilities, without regard to the other runners' individual probabilities (i.e., assuming them all to be of equal skill). That's essentially what we had discussed.

As far as that particular analysis goes, I believe that you made a slight error in calculating the per-row win probabilities for the "other" players. Specifically it appears as if you're using Pr(Other Player Win) = (1 - Pr(Given Player Win)) / 34

As you have 36 drivers, however, the denominator should in fact be 35.

But beyond that your results look fine.

Putting that aside, what follows is a simple Perl script which, given a set of absolute win probabilities, normalizes them and then outputs the probability of each driver finishing in each of 1st - 5th place (the latter number may be adjusted via the constant RELEVANT_PLACES).

Script follows:

Code:
```perl
#!perl

use strict;

############################################################################
## TITLE: race_place_probs.pl
## AUTHOR: Ganchrow (ganchrow@yahoo.com)
## SYNOPSIS: Reads from STDIN a set of newline-separated absolute win
## probabilities for any number of race participants. This script first
## normalizes the input probabilities, ensuring they sum to unity, and then
## sends to STDOUT the probabilities of each participant
## finishing in each of 1st through +RELEVANT_PLACES place.
## The script iterates recursively through the structure of win
## probabilities, summing up the probabilities of every feasible outcome
## (up to RELEVANT_PLACES places each).
############################################################################
##
## This program is free software: you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 3 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
## GNU General Public License for more details.
##
## You may find a copy of the GNU General Public License
############################################################################

use Time::HiRes;

## Change the following line to determine for how many finishing positions probabilities are output.
use constant RELEVANT_PLACES => 5;	# number of places for which we want to calculate win probabilities

## Change the following line to control the output precision inclusive of the percentage.
## A value of 4, for example, would display output in the form #.##%
use constant OUTPUT_PRECISION => 8;	# number of decimal places for output after percentage

## DO NOT change the following line
use constant REGEXP_FLOAT => qr/^(?:[+-]?)(?=\d|\.\d)\d*(?:\.\d*)?(?:[Ee](?:[+-]?\d+))?\$/;

my \$start_time = Time::HiRes::time();

MAIN: {
my \$place_probs_r = [ [] ]; 	# \$place_probs_r->[\$x]->[\$i] = probability of driver# \$i+1 finishing in place \$x+1

# Read and normalize the win probabilities into \$place_probs_r->[0].
# ASSUMPTION: this call and the sub name read_win_probs are reconstructed;
# the posted listing had lost both the call and the sub's header line.
&read_win_probs(\$place_probs_r->[0]);
&recurse(\$place_probs_r);
&display_probs(\$place_probs_r);
}

sub recurse {
my \$ptr = shift;			# pointer to probability structure
my \$n = (				# total number of drivers
shift ||
scalar( @{\$ptr->[0]} )
);
my \$cur_neg_prob = (shift || 1);	# total win probability of drivers not yet placed
my \$cur_adj_factor = (shift || 1);	# current adjustment factor for outcome
my \$included_r = (shift || {});		# HASH ref of driver numbers already included in outcome
(my \$recursion_level = (shift || 0))++;	# winning position # currently being evaluated

for(my \$i = 0; \$i < \$n; \$i++) {
next if defined(\$included_r->{\$i});

my \$prob = \$ptr->[0]->[\$i];
\$ptr->[\$recursion_level-1]->[\$i] += \$prob * \$cur_adj_factor if \$recursion_level > 1;
if (\$recursion_level < RELEVANT_PLACES) {
my \$neg_prob = \$cur_neg_prob - \$prob;
# ASSUMPTION: \$adj_factor was used but never defined in the posted listing.
# Reconstructed as P(outcome so far) / (probability mass still unplaced).
my \$adj_factor = \$cur_adj_factor * \$prob / \$neg_prob;
\$included_r->{\$i} = 1;
&recurse(\$ptr, \$n, \$neg_prob, \$adj_factor, \$included_r, \$recursion_level);
undef(\$included_r->{\$i});
}
}
}

sub read_win_probs {
my \$win_probs_r = shift;

my \$total_prob = 0;	# normalization factor
my \$n = 0;		# number of drivers
while(<>) {
# read in data file of win probabilities

s/[^0-9.%Ee+-]+//gs; # remove white space and non-numeric characters

if (! m/\d/ ) {
warn "Skipping non-numeric line# \$.\n";
next;
}

\$_ /= 100 if(s/%\$//); 	# adjust if probs quoted as percentages

if (\$_ =~ REGEXP_FLOAT and \$_ > 0){
# add win probability to array referenced by
# \$place_probs_r->[0] (1st place finish)
# increase \$total_prob so we can normalize later

\$win_probs_r->[\$n++] = 0+\$_;
\$total_prob += \$_;
} else {
warn "Skipping line # \$. with prob=\$_\n";
next;
}
}

if (\$n < RELEVANT_PLACES) {
die "Invalid input data: Only \$n drivers and " . RELEVANT_PLACES . " places.\n";
} elsif (\$total_prob <= 0) {
die "Invalid input data: Total probability = \$total_prob\n";
}
&normalize_probs(\$win_probs_r, \$total_prob);
}

sub normalize_probs {
my \$win_probs_r = shift;
my \$total_prob = shift;
if (\$total_prob != 1 ) {
# normalize win probabilities to ensure they sum to unity
warn "Normalizing win probabilities by a factor of \$total_prob\n";
@{\$win_probs_r} = map { \$_ /= \$total_prob } @{\$win_probs_r};
}
}

sub display_probs {
my \$ptr = shift;
my \$prec = int(+OUTPUT_PRECISION - 2);
\$prec = 0 if \$prec < 0;
my \$n = scalar(@{\$ptr->[0]});
my \$buffer = '';
for (my \$i = 1; \$i <= RELEVANT_PLACES; \$i++) {
\$buffer .= "\t\$i";
}
print "\$buffer\n";
\$buffer = '';
for (my \$i = 0; \$i < \$n; \$i++) {
\$buffer = (\$i+1) . '';
for (my \$j = 0; \$j < RELEVANT_PLACES; \$j++) {
\$buffer .= sprintf("\t%0.\${prec}f%%", 100*\$ptr->[\$j]->[\$i]);
}
print "\$buffer\n";
\$buffer = '';
}
}

END {
my \$end_time = Time::HiRes::time();
warn "Script completed in " . sprintf("%0.02f", ( \$end_time - \$start_time )) . " seconds.\n";
exit 0;
}```

Using your 36 win probabilities as given, the script took a shade under 100 seconds on my otherwise engaged PC to complete for the top 5 places (calculating win probabilities for the top 6 places would take about 30 times longer). Recoding the recurse() subroutine in C should speed up execution time by at least an order of magnitude.

Output is as follows:

All the real work is done by the recurse() subroutine, which calculates probabilities for each of the $_{36}P_5$ = 45,239,040 possible outcomes, while simultaneously updating the array of arrays structure referenced by \$place_probs_r with the probability of each participant finishing in the given place ≤ RELEVANT_PLACES. An anonymous hash stores the participants who have already been included in a given outcome so as to ensure that each participant may only finish in a single place. Other than that I think it should be fairly straightforward even as pseudocode.
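For readers who don't use Perl, the same traversal can be sketched in Python. This is a hedged re-expression of the idea described above rather than a port of the posted script: it assumes that, once some drivers are placed, each remaining driver's chance of taking the next place is proportional to his win probability. All names are illustrative.

```python
def enumerate_place_probs(win_probs, places):
    """Return table[k][i] = P(driver i finishes in place k+1), by full enumeration."""
    total = sum(win_probs)
    p = [w / total for w in win_probs]        # normalize so the inputs sum to 1
    n = len(p)
    table = [[0.0] * n for _ in range(places)]
    used = [False] * n                        # drivers already placed in this outcome

    def recurse(level, remaining, weight):
        # level: 0-based place being filled
        # remaining: total win probability of drivers not yet placed
        # weight: probability of the partial finishing order built so far
        for i in range(n):
            if used[i]:
                continue
            prob = weight * p[i] / remaining
            table[level][i] += prob
            if level + 1 < places:
                used[i] = True
                recurse(level + 1, remaining - p[i], prob)
                used[i] = False

    recurse(0, 1.0, 1.0)
    return table


# One 1/17 runner against 29 equally matched rivals, top 3 places only.
field = [1 / 17] + [(16 / 17) / 29] * 29
table = enumerate_place_probs(field, 3)
print(f"P(top 3) = {sum(row[0] for row in table):.5%}")
```

A boolean list stands in for the hash of already-placed participants. Pure Python will be slow for 36 drivers and 5 places, so the example keeps to 3 places.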

14. Thank you. You just saved me many hours.

I'm still surprised by the result distribution there too though. I guess my intuition was wrong, that the 3rd ranked driver would have a higher probability of finishing 3rd than 2nd or first.

15. Originally Posted by Optional
I'm still surprised by the result distribution there too though. I guess my intuition was wrong, that the 3rd ranked driver would have a higher probability of finishing 3rd than 2nd or first.
That's just a function of the selected initial distribution of win probabilities.

For example, if you were to bump up driver# 1's win probability to 18.5%, and then proportionally reduce the win probs for the remaining drivers then for driver #3:

Pr(1st) = 5.800617%
Pr(2nd) = 5.858729%
Pr(3rd) = 5.870338%
Pr(4th) = 5.839875%
Pr(5th) = 5.771500%

Which is probably more in line with your intuition.

You'd find a similar dynamic at play were you to sufficiently lower the relative win probabilities of enough of the lesser ranked drivers.
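The adjustment described above (fix one driver's win probability, then shrink the rest proportionally) might be sketched as follows. The four-driver field is hypothetical, since the actual 36 probabilities are not reproduced in the thread, and `set_win_prob` is an illustrative name.

```python
def set_win_prob(win_probs, index, new_prob):
    """Set one driver's win probability and rescale the others so the total is 1."""
    rest = sum(win_probs) - win_probs[index]
    scale = (1 - new_prob) / rest             # common factor applied to everyone else
    return [new_prob if i == index else w * scale
            for i, w in enumerate(win_probs)]


# Hypothetical field, for illustration only.
field = [0.40, 0.30, 0.20, 0.10]
adjusted = set_win_prob(field, 0, 0.185)
print([round(x, 4) for x in adjusted])
```

The other drivers keep their win probabilities in the same ratio to one another; only their overall share changes.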

16. Originally Posted by Ganchrow
Using your 36 win probabilities as given, the script took a shade under 100 seconds on my otherwise engaged PC to complete for the top 5 places (calculating win probabilities for the top 6 places would take about 30 times longer). Recoding the recurse() subroutine in C should speed up execution time by at least an order of magnitude.
Try two orders of magnitude.

I recoded the Perl recurse() subroutine as the C++ function fnRaceRecurse. For the 36 drivers previously given it calculated the win probabilities for the top 5 places in less than half a second, the top 6 places in about 12 seconds, and the top 7 places in about 6 minutes.
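Those timings track the number of ordered outcomes that have to be visited. A quick sketch of the count, assuming run time is roughly proportional to it (`ordered_outcomes` is an illustrative name):

```python
def ordered_outcomes(n_drivers, places):
    """Number of ways to fill the first `places` positions from n_drivers (nPk)."""
    count = 1
    for j in range(places):
        count *= n_drivers - j
    return count


previous = None
for k in (5, 6, 7):
    count = ordered_outcomes(36, k)
    growth = "" if previous is None else f" ({count // previous}x the previous)"
    print(f"top {k}: {count:,} outcomes{growth}")
    previous = count
```

Each extra place multiplies the work by the number of drivers still unplaced, about 30 here, which is why half a second becomes roughly 12 seconds and then roughly 6 minutes.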

It works the same way as the Perl sub, calling itself recursively across every outcome. The only differences are:
• the \$included_r hash reference is now a boolean-valued array
• the output probability array, pdOutProbs[], no longer includes 1st place win probabilities.
• pdOutProbs[] is linearized such that pdOutProbs[iDrivers * (n - 2) + i] represents the ith driver's probability of finishing in place n (n ≥ 2)
• a pointer to the start of the pdOutProbs array (which must be correctly sized and initialized to zero) is now passed between function calls
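The linearized layout in the third bullet can be illustrated with a small index helper. This is only a sketch of the indexing rule as stated; it assumes driver indices are 0-based, which the post does not spell out, and the names are made up.

```python
def flat_index(driver, place, n_drivers):
    """Position in the flat output array of `driver`'s probability for `place` (place >= 2)."""
    if place < 2:
        raise ValueError("1st place probabilities are not stored in the output array")
    return n_drivers * (place - 2) + driver


n_drivers, relevant_places = 36, 5
out_probs = [0.0] * (n_drivers * (relevant_places - 1))   # zero-initialized, as required

# Slots used by driver 0 for places 2 through 5.
print([flat_index(0, place, n_drivers) for place in range(2, relevant_places + 1)])
```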

Attached is a zip file containing the above source compiled into a Win32 DLL (race.dll), an Excel demonstration file (race_place_probs.xls), as well as the VC++ source code.

The VBA wrapper function defined in race_place_probs.xls is:

Public Function RacePlaceProbs(vWinProbs, Optional ByVal lRelevantPlaces As Long = 3) As Double()

The first argument is an array of (already normalized) win probabilities, and the second argument the number of places for which probabilities need be calculated (it defaults to 3).

It can be called from Excel as an array function (demonstrated in the spreadsheet) and the output will be # of drivers rows by (lRelevantPlaces-1) columns.

Just make sure to extract the DLL to the same directory as the spreadsheet (or to a directory in your PATH) or else it won't run.

17. Originally Posted by Ganchrow
it calculated the win probabilities for the top 5 places in less than half a second, the top 6 places in about 12 seconds, and the top 7 places in about 6 minutes
My PC is a lot slower than yours. I ambitiously started a test to calculate the top 10 places about an hour ago and it's reporting only 40% done so far.

You have taught me a lot, and left me with exactly the calculation solution I set out looking for, plus more.

After looking at the output, I think I will try refining my handicapping system to exclude even more runners from the final calculations, as I think the lower probability results are just creating unnecessary noise when it comes to the more important top 15 ranking.