1. #36
    tomcowley
    Join Date: 10-01-07
    Posts: 1,129
    Betpoints: 6786

    If that 3-effect is real, it's something really really strange. I created 1 million synthetic 13-15 game salamis by picking that many random games from the last 8 years and adding up the MOVs. My push%s for each number in range were 2.3x% and there were no abnormally weak or strong numbers for 13, 14, or 15 games.
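A minimal sketch of this kind of synthetic-salami experiment follows. The post does not show the original code or data, so the `movs` list below is hypothetical stand-in data, the function name `simulate_salamis` is invented for illustration, and drawing games without replacement within each salami is an assumption.

```python
import random
from collections import Counter

def simulate_salamis(movs, size, trials, rng):
    """Return a Counter of summed home MOVs over `trials` synthetic salamis."""
    totals = Counter()
    for _ in range(trials):
        # Draw `size` distinct games and add up their home margins of victory.
        totals[sum(rng.sample(movs, size))] += 1
    return totals

# Hypothetical stand-in data: real use would load one home MOV per historical game.
rng = random.Random(2010)
movs = [rng.choice([-1, 1]) * rng.randint(1, 8) for _ in range(5000)]

trials = 20_000
totals = simulate_salamis(movs, size=15, trials=trials, rng=rng)

# Share of salamis landing exactly on each total, i.e. the push% for that number.
for mov in range(-3, 4):
    print(f"{mov:+d}\t{100 * totals[mov] / trials:.2f}%")
```

With real game data in place of the stand-in list, an abnormally weak number would show up as a push% clearly below its neighbours.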

  2. #37
    Ganchrow
    Nolite te bastardes carborundorum.
    Join Date: 08-28-05
    Posts: 5,011
    Betpoints: 1088

    Quote Originally Posted by tomcowley View Post
    If that 3-effect is real, it's something really really strange. I created 1 million synthetic 13-15 game salamis by picking that many random games from the last 8 years and adding up the MOVs. My push%s for each number in range were 2.3x% and there were no abnormally weak or strong numbers for 13, 14, or 15 games.
    Yeah, following your lead, I ran 10,000,000 synthetic 15-game Salamis (without replacement per Salami, FWIW), with games randomly selected from 1990-2009 data via the 32-bit Mersenne Twister PRNG, fully seeded via random.org.

    And with all that I'm seeing essentially the same results as you've described. I still have no explanation for the observed 3-effect (and 3-multiple effect). But based upon these results, I have no choice but to concede that my earlier dismissal of the possibility of a statistical aberration was likely made considerably too hastily. (Although I'm not ruling out earlier programmer error on my part either.)

    In case anyone's interested, here's the hastily hacked together Perl code I used to simulate the 15-game Salamis:

    Code:
    #!perl
    
    use strict;
    use Math::Random::MT;
    # Mersenne Twister module available from CPAN
    # http://search.cpan.org/~ams/Math-Random-MT-1.11/MT.pm
    # requires Perl 5.10.0 or higher
    
    use constant SIZE => 15;
    use constant TRIALS => 10_000_000;
    
    my (@movs, $rand_gen, $sum_r, );
    BEGIN {
    	warn "Seeding random number generator.\n";
    	require LWP::Simple;
    	my $RAND_URL=\("http://random.org/integers/?num=1248&min=0&max=65535&col=2&base=10&format=plain&rnd=new");
    	my (@seed);
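    	my $raw = LWP::Simple::get($$RAND_URL);
    	die "Could not fetch seed data from random.org\n" unless defined $raw;
    	foreach (split(/\n/, $raw)) {
    		next unless m/^([0-9]+)\s+([0-9]+)$/;	# skip any malformed line
    		push @seed, $1 + $2*2**16;	# two 16-bit values -> one 32-bit seed word
    	}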
    	$rand_gen = Math::Random::MT->new(@seed);
    	warn "Random number generator seeded.\n";
    }
    
    while(<>) {
    	next unless m/^[12][0-9]{7}/;
    	chomp;
    	my($date, $away, $home, $mov,) =split;
    	push @movs, $mov;
    }
    
    for(my $i=1; $i<=TRIALS; $i++) {
    	my $selected = {};
    	my $sum = 0;
    	warn "TRIAL# $i\n" unless $i % 10_000;
    	for (my $j=0; $j < SIZE; $j++) {
    		my $r = int($rand_gen->rand($#movs + 1));
    		redo if $selected->{$r};
    		$selected->{$r} = 1;
    		$sum += $movs[$r];
    	}
    	$sum_r->{$sum}++;
    }
    
    foreach my $sum (sort {$a <=> $b} keys %{ $sum_r } ) {
    	print "$sum\t$sum_r->{$sum}\n";
    }



    And the results from the 10,000,000 sim:


    Hard to argue with that.

  3. #38
    Data
    Join Date: 11-27-07
    Posts: 2,236

    Quote Originally Posted by Ganchrow View Post
    So for a 15-game Salami the CLT would predict a mean and standard deviation of 1.94139 and 17.00553 runs respectively. The following table compares the predicted frequencies (using a continuity correction) with actual 15-game Salami results over the in-sample time period:
    I think the most interesting take-away from this would be the low observed frequency of the 3-run home Salami MOV (and, it turns out, for subsequent multiples of 3).
    I just cannot see any significance of the number 3. However, I suspect that 7 should produce noticeable anomalies. As you can see, MOV -4 is less than expected, same as MOV 3. So, I am wondering what are the numbers for MOV -11 and MOV 10. The effect should exist albeit to a lesser extent.
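For reference, the CLT-predicted frequency of any single Salami MOV can be sketched as below, using the mean and standard deviation quoted above and a +/-0.5 continuity correction. The function names are invented for illustration; this is the standard normal approximation, not necessarily the exact calculation behind the quoted table.

```python
import math

MEAN = 1.94139   # CLT-predicted mean of the 15-game home Salami MOV (from the post)
SD = 17.00553    # CLT-predicted standard deviation (from the post)

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal(mean, sd) variable."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def predicted_freq(k, mean=MEAN, sd=SD):
    """Normal approximation to P(MOV == k) with a +/-0.5 continuity correction."""
    return normal_cdf(k + 0.5, mean, sd) - normal_cdf(k - 0.5, mean, sd)

# The MOVs singled out in the post above.
for k in (-11, -4, 3, 10):
    print(f"MOV {k:+d}: {100 * predicted_freq(k):.2f}%")
```

Comparing these predictions against observed frequencies at -11, -4, 3 and 10 is one way to check whether a 7-run pattern is present.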

  4. #39
    flyingillini
    Join Date: 12-06-06
    Posts: 41,218
    Betpoints: 2187

    It's nice to see Ganch posting this information!

  5. #40
    Ganchrow
    Nolite te bastardes carborundorum.
    Join Date: 08-28-05
    Posts: 5,011
    Betpoints: 1088

    Quote Originally Posted by Data View Post
    I just cannot see any significance of the number 3. However, I suspect that 7 should produce noticeable anomalies. As you can see, MOV -4 is less than expected, same as MOV 3. So, I am wondering what are the numbers for MOV -11 and MOV 10. The effect should exist albeit to a lesser extent.
    Nor can I.

    Unless we're both missing something, however, TomCowley's experiment (followed by my reproduction of his results -- now that's real science: reproducible experimentation) does rather strongly suggest that to be an aberration.

    Also don't discount the possibility that I simply made a mistake in my earlier data culling. While I have rechecked it, someone else might want to verify my initial findings. The fact is that it does represent a fairly large outlier (although once again because we're dealing with several strata of categorical data, the results are not actually as extreme as they might appear at first glance).

    Anyway, if you're interested in CLT predicted vs. actual results over a larger support:

    Just remember, of course, that CLT convergence drops off as we further approach the distribution tails.

    Oh and btw, Data, I just checked my ledger and it appears that you still owe me a drink. Please don't make me call you a stiff on the open forum.

  6. #41
    Ganchrow
    Nolite te bastardes carborundorum.
    Join Date: 08-28-05
    Posts: 5,011
    Betpoints: 1088

    Quote Originally Posted by flyingillini View Post
    It's nice to see Ganch posting this information!
    Thanks for the kind words.

    And I find it nicer still to see so many people getting involved in these conversations.

  7. #42
    Data
    Join Date: 11-27-07
    Posts: 2,236

    Quote Originally Posted by Ganchrow View Post
    TomCowley's experiment (followed by my reproduction of his results -- now that's real science: reproducible experimentation) does rather strongly suggest that to be an aberration.
    I am sorry but I do not put too much weight into these results due to my complete disagreement with a blind assumption that n-game salami distribution is sufficiently close to a distribution of randomly selected n games. There are parameters that make any given salami a non-random set, some of those parameters like teams' relative strength or cold/hot season are self-evident while likely there are others that are not immediately obvious. Please note, this assumption may be proven correct at the end with more research done but the initial assumption that I would make is that non-randomness must not be ignored.

    Here is my "back of the envelope" take on this. A 15-game salami will fall into one of 16 subsets, where the home teams win 0 to 15 games. Let's assume that each "step" from one subset to another will result in changing the Home team MOV by 2.6 and the Away team MOV by 3. (Note that these numbers seem reasonably close to the medians and should not be raising eyebrows.) With that assumption, here are the expected maximums and minimums in each subset's distribution, with minimums positioned right in the middle.

    As far as I can tell, this table matches your real observed results pretty well. I am not saying this is nearly as accurate as the results of 1,000,000 simulations, but I tend to think that this is a better approach in both the logic behind it and the results.
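One possible reading of this back-of-the-envelope model can be sketched as follows. The interpretation is an assumption: every home win is taken to add 2.6 runs to the salami MOV and every away win to subtract 3, so each of the 16 subsets is centred at a fixed total and the "minimums" sit at the midpoints between neighbouring centres. The names `subset_center`, `centers` and `troughs` are invented for illustration, and this is not necessarily the exact construction behind the table in the post.

```python
GAMES = 15
HOME_STEP = 2.6   # assumed typical margin of a home win (from the post)
AWAY_STEP = 3.0   # assumed typical margin of an away win (from the post)

def subset_center(home_wins):
    """Rough salami MOV if every game landed exactly on its typical margin."""
    return HOME_STEP * home_wins - AWAY_STEP * (GAMES - home_wins)

# One centre ("maximum") per subset: 0 through 15 home wins.
centers = [subset_center(k) for k in range(GAMES + 1)]

# Under this reading the "minimums" fall halfway between adjacent centres.
troughs = [(a + b) / 2 for a, b in zip(centers, centers[1:])]

for k, c in enumerate(centers):
    print(f"{k:2d} home wins -> centre near {c:+.1f}")
print("midpoints:", [round(t, 1) for t in troughs])
```

Under these assumptions adjacent centres are 5.6 runs apart, so any predicted dips would repeat at roughly that spacing.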

    Quote Originally Posted by Ganchrow View Post
    Oh and btw, Data, I just checked my ledger and it appears that you still owe me a drink. Please don't make me call you a stiff on the open forum.
    Please be reminded that our payout method requires customer's physical presence in ***** city. Should you satisfy this requirement you can request your payout on any day.
    Last edited by Data; 07-04-10 at 01:11 PM.

  8. #43
    Ganchrow
    Nolite te bastardes carborundorum.
    Join Date: 08-28-05
    Posts: 5,011
    Betpoints: 1088

    First off, Data, all I can say is that finally someone other than myself has begun using the [table][/table] BB tags. Good on you, ya Cossack. A trend brewing perhaps? It warms my otherwise frigid heart.

    Quote Originally Posted by Data View Post
    I am sorry but I do not put too much weight into these results due to my complete disagreement with a blind assumption that n-game salami distribution is sufficiently close to a distribution of randomly selected n games. There are parameters that make any given salami a non-random set, some of those parameters like teams' relative strength or cold/hot season are self-evident while likely there are others that are not immediately obvious. Please note, this assumption may be proven correct at the end with more research done but the initial assumption that I would make is that non-randomness must not be ignored.
    Fair enough, although I'm not at first blush particularly inclined to agree. Still, I can't immediately offer compelling evidence to the contrary.

    Quote Originally Posted by Data View Post
    Here is my "back of the envelope" take on this. A 15-game salami will fall into one of 16 subsets, where the home teams win 0 to 15 games. Let's assume that each "step" from one subset to another will result in changing the Home team MOV by 2.6 and the Away team MOV by 3. (Note that these numbers seem reasonably close to the medians and should not be raising eyebrows.) With that assumption, here are the expected maximums and minimums in each subset's distribution, with minimums positioned right in the middle.

    As far as I can tell, this table matches your real observed results pretty well. I am not saying this is nearly as accurate as the results of 1,000,000 simulations, but I tend to think that this is a better approach in both the logic behind it and the results.
    I think that's certainly a fair initial step in what ideally would become a larger combinatorial analysis. My primary objection, however, would be that by reducing the variance via solely considering the medians of the two states (i.e., home win and home loss) and ignoring the tails, we'd necessarily be creating a results distribution more discrete than what we'd find in reality.

    In an attempt to be as fair-minded as you, however (at least in this post), I do have to concede that this certainly does present the beginnings of what could be a strong counter-theory.

    With that in mind, and in an attempt to further dissect the data, following are frequency analyses broken down by periods:









    These do rather clearly show that the 3-run gap appears fairly uniformly from year to year (excepting 1990-1998, a period notable for the paucity of 15-game Salamis). Less so for the 6-run gap, but still not to what might be construed as a negligible extent.

    Now looking at it from month to month:












    So it does indeed seem that (March/April excluded) this is a consistent phenomenon from month to month. How statistically relevant is this in light of both our prior in-sample observation of the 3-run phenomenon and the small sample sizes of each of our data partitions? Well, that's a bit too much multinomial statistics for me to wade through on a Monday, but my first inclination would be "not irrelevant but still probably less relevant than it might appear at first glance".

    Anyway, as I've already reversed my opinion on this at least once and arguably twice, I'm going to temporarily recuse myself and wait and see if any other analysts among us can make some compelling arguments.

    Quote Originally Posted by Data View Post
    Please be reminded that our payout method requires customer's physical presence in ***** city. Should you satisfy this requirement you can request your payout on any day.
    Stiff.

  9. #44
    Data
    Join Date: 11-27-07
    Posts: 2,236

    Cossack, stiff... With Ganchrow's departure the TT went downhill with posters resorting to name calling and posting pictures. Pathetic...

  10. #45
    Data
    Join Date: 11-27-07
    Posts: 2,236

    Quote Originally Posted by Ganchrow View Post
    My primary objection, however, would be that by reducing the variance via solely considering the medians of the two states (i.e., home win and home loss) and ignoring the tails, we'd necessarily be creating a results distribution more discrete than what we'd find in reality.
    Sure, but I was not going to ignore the tails. I was merely attempting to make some sense of the "bumps". Kind of taking the sims as a first approximation and then introducing some small "waves" instead of a curve line.

  11. #46
    Ganchrow
    Nolite te bastardes carborundorum.
    Join Date: 08-28-05
    Posts: 5,011
    Betpoints: 1088

    Quote Originally Posted by Data View Post
    Sure, but I was not going to ignore the tails. I was merely attempting to make some sense of the "bumps". Kind of taking the sims as a first approximation and then introducing some small "waves" instead of a curve line.
    Of course. What we're all just trying to get at is a reasonable explanation for the relative dearth of 3-run Home Salami wins. You and Cowley have each produced somewhat competing arguments, each with merits, each with holes. Me, I'm just hoping to be convinced one way or the other before I have to do any serious thinking.

  12. #47
    Ganchrow
    Nolite te bastardes carborundorum.
    Join Date: 08-28-05
    Posts: 5,011
    Betpoints: 1088

    Quote Originally Posted by Data View Post
    Cossack, stiff... With Ganchrow's departure the TT went downhill with posters resorting to name calling and posting pictures. Pathetic...
    Did you miss your last appointment with Dr. Soong (nerd alert) for installation of your upgraded humor chip?

  13. #48
    Data
    Join Date: 11-27-07
    Posts: 2,236

    Quote Originally Posted by Ganchrow View Post
    Did you miss your last appointment with Dr. Soong (nerd alert) for installation of your upgraded humor chip?
    Perhaps, but this only says that, unlike for you, there is a hope for me.

  14. #49
    Ganchrow
    Nolite te bastardes carborundorum.
    Join Date: 08-28-05
    Posts: 5,011
    Betpoints: 1088

    Quote Originally Posted by Data View Post
    Perhaps, but this only says that, unlike for you, there is a hope for me.
    I abandoned all hope years ago.

  15. #50
    mathdotcom
    Join Date: 03-24-08
    Posts: 11,689
    Betpoints: 1943

    So today we have:

    +155/-175 at CRIS (and currently +150/-160 at Pinn)
    Home -4.5 -105 at CRIS

    If we take the fair line to be 155, using Ganchrow's table the probability of the home MOV to be more than 4 is ~ 0.5226 > break even probability of 0.5122 at -105.

    With a fair line of 165, the probability of home MOV > 4 is 0.5374.

    Pinn has -4.5 @ -103, too. What am I missing?
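The break-even comparison in this post can be sketched as below. The helper name `break_even_prob` is invented for illustration, and the cover probability is simply the figure quoted in the post, taken as given.

```python
def break_even_prob(american_odds):
    """Win probability at which a bet at the given American odds has zero EV."""
    if american_odds < 0:
        # Laying |odds| to win 100: must win |odds| / (|odds| + 100) of the time.
        return -american_odds / (-american_odds + 100)
    # Risking 100 to win `odds`: must win 100 / (odds + 100) of the time.
    return 100 / (american_odds + 100)

price = -105       # Home -4.5 at CRIS (from the post)
p_cover = 0.5226   # estimated P(home MOV > 4) at a fair line of 155 (from the post)

edge = p_cover - break_even_prob(price)
print(f"break-even {break_even_prob(price):.4f}, estimated edge {edge:+.4f}")
```

A positive edge here only means the estimate exceeds the break-even rate; whether the estimate itself is appropriate for this particular salami is a separate question.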

  16. #51
    tomcowley
    Join Date: 10-01-07
    Posts: 1,129
    Betpoints: 6786

    The points are going to be worth more in general than the push%s above because those push %s are for all salamis (it's like asking what the NFL 3 push % is by looking at all the games instead of the games lined in the neighborhood of 3). Also, 12 game salami today, so the points are worth a bit more.

  17. #52
    mathdotcom
    Join Date: 03-24-08
    Posts: 11,689
    Betpoints: 1943

    Good point tom

    I will be back the next day there are 15 games

  18. #53
    mathdotcom
    Join Date: 03-24-08
    Posts: 11,689
    Betpoints: 1943

    Cris:
    Away +4 -105
    Home -4 -115

    Away ML +150
    Home ML -170

    If fair odds on ML are 160, then again using Ganch's table the probability of Home MOV > 4 is ~ 0.5301, which suggests a fair line of:

    Home -4.5 -113

    Pinnacle currently has -4.5 -107 with Away/Home as +156/-166.

    Nothing to get excited about but there seems to be a small bias.
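Converting an estimated cover probability into a fair (zero-vig) American price, as done in this post, can be sketched as follows. The helper name `fair_american_odds` is invented for illustration, and the probability is the figure quoted in the post.

```python
def fair_american_odds(prob):
    """American odds at which a bet winning with probability `prob` breaks even."""
    if prob >= 0.5:
        # Favourite: negative price, amount laid to win 100.
        return -100 * prob / (1 - prob)
    # Underdog: positive price, amount won per 100 risked.
    return 100 * (1 - prob) / prob

p_cover = 0.5301   # estimated P(home MOV > 4) at a fair ML of 160 (from the post)
print(round(fair_american_odds(p_cover)))
```

Any posted price on Home -4.5 that is cheaper than this fair price would indicate a small edge, provided the probability estimate holds.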
