  1. #1
    boxcar
    Join Date: 03-05-08
    Posts: 31

    Probability with small sample size

    I have a question about the reliability of collections of data with somewhat small sample sizes that add up to a big sample size.

    Suppose you have a system with several (15-20) independent categories of wagers, all based on similar theories. Taken separately, each category is a small sample (roughly 10-20 events/yr, some bigger, some smaller), but together they total hundreds of events per year, and the aggregate results are similar from year to year. For example, Category 1 might underperform in year 2 while Category 2 overperforms, and so on down the line, so that the underperformers are evened out by the overperformers. When you add up the results of all the categories for a given year, the winning percentage comes out right around 57%, every year.

    So in the aggregate you have pretty remarkably consistent results, but based on a collection of small independent but logically related categories, whose results vary from year to year. Do you have a statistically reliable edge here?
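
One way to put a number on the question above is an exact binomial tail: how often would pure chance produce a record at least this good? The sketch below is only an illustration. The 300 bets per year is an assumed stand-in for "hundreds", and the 11/21 break-even rate assumes standard -110 pricing, which the post does not state. It also treats the pooled bets as independent, and it says nothing about whether the categories were chosen after looking at the results.

```python
from math import comb

def binom_tail(n, k, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

N_BETS = 300             # assumption: "hundreds" of bets in one year
WINS = 171               # 57% of 300
BREAK_EVEN = 11 / 21     # assumption: -110 odds, about 52.4% needed to break even

# Chance of a record this good if every bet were a coin flip.
p_coin = binom_tail(N_BETS, WINS, 0.5)

# Chance of a record this good if the true rate were only break-even.
p_breakeven = binom_tail(N_BETS, WINS, BREAK_EVEN)

print(f"P(>= {WINS} wins | p = 0.500) = {p_coin:.4f}")
print(f"P(>= {WINS} wins | p = {BREAK_EVEN:.3f}) = {p_breakeven:.4f}")
```

The second number is the more demanding one: beating a coin flip is not the same as beating the price. Pooling several years into one sample (if the selection rules were fixed in advance) would tighten both.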

  2. #2
    marcoforte
    Join Date: 08-10-08
    Posts: 140
    Betpoints: 396

    I got hammered last year playing a system similar to yours: small groups of 20-30 events with win rates of 65% or better. I ended up demonstrating regression to the mean; they failed miserably. Now I require a sample size of 60 before I consider a result statistically significant. I picked 60 based on work I do in my full-time job, where the FDA uses 60 as a statistically sound sample for quality control purposes.
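
The regression to the mean described above can be reproduced with a small simulation. This is a sketch under deliberately pessimistic assumptions, not a model of anyone's actual system: every candidate system is a pure coin flip (`TRUE_P = 0.5`), the number of candidates (`N_SYSTEMS`) is made up, and groups are 25 bets each. The point is only to show what selecting on a 65% in-sample record does when there is no edge at all.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

N_SYSTEMS = 2000   # hypothetical number of candidate angles examined
N_BETS = 25        # small group size, within the 20-30 range from the post
TRUE_P = 0.5       # assumption: no system has any real edge

def win_rate(n, p):
    """Simulate n independent bets and return the fraction won."""
    return sum(random.random() < p for _ in range(n)) / n

# "Discovery" season: keep only the systems that hit 65% or better.
in_sample = [win_rate(N_BETS, TRUE_P) for _ in range(N_SYSTEMS)]
selected = [r for r in in_sample if r >= 0.65]

# Following season: the same number of systems, fresh bets, same true rate.
out_sample = [win_rate(N_BETS, TRUE_P) for _ in selected]

mean_in = sum(selected) / len(selected)
mean_out = sum(out_sample) / len(out_sample)
print(f"systems kept: {len(selected)} of {N_SYSTEMS}")
print(f"in-sample win rate:     {mean_in:.3f}")
print(f"out-of-sample win rate: {mean_out:.3f}")
```

The kept systems look excellent in the season used to pick them and drift back toward 50% afterwards, because the selection step is what produced the high number in the first place.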

  3. #3
    Peep
    Join Date: 06-23-08
    Posts: 2,295

    I would get some "pristine data", i.e. data other than what you used to find or shape your plays.

    IF it tests successfully on the new unfootprinted data, you MAY have something.

    I like Marcoforte's post. Sometimes what looks great in the rear view mirror just doesn't cut it going forward (and sometimes it does).

  4. #4
    BuddyBear
    Join Date: 08-10-05
    Posts: 7,233
    Betpoints: 4805

    I think the title is poorly stated, and a lot of what you are saying doesn't make much sense to me either. A probability can be defined for any number of events, so sample size isn't necessarily relevant when you are talking about the probability of an event occurring. I think what you are trying to ask is: what is the probability of X occurring under a specified set of conditions, i.e. P(X) = ?

    Can you phrase this question in a more concise and clear way?
    Are you talking about reliability of small samples vs. large samples?

    Thanks....

  5. #5
    boxcar
    Join Date: 03-05-08
    Posts: 31

    Say you have a theory that is based on what fruit students pick at the cafeteria. You observe 100 different students, and they can pick one of several different fruits. You track them every day, and after a month you can see that different students show a preference for certain fruits. They don't always pick according to their preference; some weeks a banana kid may pick all strawberries.

    You develop a system to predict fruit choices of select children showing a bias, and under your system a child that chooses one fruit 60% of the time in a month exhibits a bias. Let's say you've found 15 such kids who show such a bias.

    Each kid picks only 7 fruits a week, and therefore the individual sample sizes are very small. If you base it on a month of data, that's only 30 events per child.

    You find that although the percentage of preferred fruits chosen by an individual in a given month often deviates from their demonstrated preference, this tends to be evened out by other children picking more of their preferred fruit than usual. In fact, this evening out is so effective that, when added up, the percentage of the time you are correct overall is consistently 60% each month.

    So the question is, with small individual and independent sample sizes, which when added together create a consistent aggregate pattern, do you have statistically significant data that can be expected to reliably predict future events?

    Assume for the purposes of the discussion that we aren't adding each month's data to the total, which of course would give a larger sample size each ensuing month and improve the reliability of the data.
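
The "evening out" in the fruit example is roughly what the standard error of a proportion predicts. The sketch below uses the numbers from the example (15 kids, about 30 picks each per month, 60% preference) and assumes, purely for illustration, that every kid has the same true 60% rate and that picks are independent.

```python
from math import sqrt

def std_error(p, n):
    """Standard error of an observed proportion over n independent trials."""
    return sqrt(p * (1 - p) / n)

P_PREF = 0.60        # preference rate from the example
PICKS_PER_KID = 30   # about one month of picks per child
N_KIDS = 15          # children flagged as showing a bias

se_child = std_error(P_PREF, PICKS_PER_KID)          # one child, one month
se_pool = std_error(P_PREF, PICKS_PER_KID * N_KIDS)  # all 450 picks pooled

print(f"one child, {PICKS_PER_KID} picks: +/- {se_child:.3f}")
print(f"pooled, {PICKS_PER_KID * N_KIDS} picks:  +/- {se_pool:.3f}")
print(f"ratio: {se_child / se_pool:.2f}")  # equals sqrt(N_KIDS)
```

Under these assumptions a single child's monthly rate swings by about 9 points either way, while the pooled rate swings by only about 2, so a steady aggregate alongside noisy individuals is expected rather than surprising. It supports pooling independent small samples, but it does not by itself show the 60% is real if the kids were selected using the same month of data.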

    Any input is greatly appreciated. Thanks!

  6. #6
    Art Vandeleigh
    Join Date: 12-31-06
    Posts: 1,494
    Betpoints: 459

    edit
    Last edited by Art Vandeleigh; 08-20-08 at 11:25 AM.
