Interesting article\alforithm on classifying winners

  • podonne
    SBR High Roller
    • 07-01-11
    • 104

    #1
    Interesting article\alforithm on classifying winners
    I read an interesting article in last week's Economist about a method of classifying artworks by looking at a large set of quantifiable factors. As usual when I read things like this, I was thinking about how to apply it to sports betting, namely classifying winners.

    From the article:
    All told, the computer identified 4,027 different numerical descriptors. Once their values had been established for each of the 513 artworks that had been fed into it, it was ready to do the analysis.
    Dr Shamir’s aim was to look for quantifiable ways of distinguishing between the work of different artists. If such things could be established, it might make the task of deciding who painted what a little easier. Such decisions matter because, even excluding deliberate forgeries, there are many paintings in existence that cannot conclusively be attributed to a master rather than his pupils, or that may be honestly made copies whose provenance is now lost.
    To look for such distinguishing features, Dr Shamir programmed the computer to use a statistical method that scores the strength of the distance between the values of two or more descriptors for each pair of artists. As a result, he was able to rank each of the 4,027 descriptors by how useful it was at discriminating between artists.
    Src: http://www.economist.com/node/21524699
    Replace the word "artworks" with "teams in games" and "artists" with "winners" and you can see why this was interesting. Through a little research I found the original paper, and I got very excited when I read that he used only 513 artworks in total (that's only about 256 games' worth of teams) and got these results:

    Each classifier was tested 50 times such that in each run the images were randomly allocated for training and test sets. The automatic classification between the paintings of Van Gogh and Pollock using low-level image content descriptors was accurate in just 92% of the cases, while the accuracy of the two-way classifiers between Pollock and Monet or Pollock and Renoir was 100% in both cases [38]. The classification accuracy was also perfect when classifying Pollock and other painters such as Dali.
    Src: http://vfacstaff.ltu.edu/lshamir/pub...k%20_final.pdf
    Pretty good results. The source code for the algorithm he used is called WND-CHARM and was originally written for classifying biological images. Available here: http://www.scfbm.org/content/3/1/13
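
    To make the idea concrete, here's a minimal sketch of the descriptor-ranking step applied to made-up game data. As I understand it, WND-CHARM scores each feature with a Fisher-style ratio and keeps the most discriminating ones; everything below (feature counts, the random numbers) is invented for illustration, not taken from the paper.

    Code:
    import numpy as np

    rng = np.random.default_rng(0)

    # Fake training set: 256 games x 40 numerical descriptors per game
    # (the paper computes thousands of descriptors per image).
    n_games, n_features = 256, 40
    X = rng.normal(size=(n_games, n_features))
    y = rng.integers(0, 2, size=n_games)   # 1 = home team won, 0 = lost

    def fisher_scores(X, y):
        """Between-class variance over within-class variance, per feature."""
        between = np.zeros(X.shape[1])
        within = np.zeros(X.shape[1])
        overall_mean = X.mean(axis=0)
        for c in np.unique(y):
            Xc = X[y == c]
            between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
            within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
        return between / np.maximum(within, 1e-12)

    scores = fisher_scores(X, y)
    ranking = np.argsort(scores)[::-1]
    print("Top 10 descriptors by discriminating power:", ranking[:10])

    With real data you would then weight or select features by these scores before classifying, which is the part that made me think of handicapping.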

    All I ask is that if you make something of this you'll share your results!
  • brettd
    SBR High Roller
    • 01-25-10
    • 229

    #2
    Automatic "classification"? "Training" sets and "test" sets?

    Sounds like discriminant analysis. Nothing new there, just applied to some interesting subject matter.
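
    For reference, a plain discriminant-analysis baseline on that kind of feature matrix is only a few lines of scikit-learn. The data below is random noise, purely to show the shape of the exercise, not anyone's actual model:

    Code:
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(1)
    X = rng.normal(size=(256, 40))    # per-game descriptors (fake)
    y = rng.integers(0, 2, size=256)  # 1 = winner, 0 = loser (fake)

    # Linear discriminant analysis with 5-fold cross-validation;
    # on pure noise this should hover around 50%.
    lda = LinearDiscriminantAnalysis()
    acc = cross_val_score(lda, X, y, cv=5, scoring="accuracy")
    print("5-fold accuracy: %.3f +/- %.3f" % (acc.mean(), acc.std()))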

    Comment
    • Jontheman
      SBR High Roller
      • 09-09-08
      • 139

      #3
      Van Gogh and Pollock are very different, and anyone off the street with no knowledge, if given 5 minutes' training on styles and distinguishing features, could achieve a 100% success rate in assigning a work to one or the other. Don't know why you're impressed that a computer could only manage 92%.
      Comment
      • podonne
        SBR High Roller
        • 07-01-11
        • 104

        #4
        Originally posted by Jontheman
        Van Gogh and Pollock are very different, and anyone off the street with no knowledge, if given 5 minutes' training on styles and distinguishing features, could achieve a 100% success rate in assigning a work to one or the other. Don't know why you're impressed that a computer could only manage 92%.
        One word: scalability. Also, you should read the article. The author found that Van Gogh and Pollock were more similar to each other than to the artists typically associated with Van Gogh, like Monet or Renoir:

        Surprisingly, the values of 19 of the 20 most informative descriptors showed dramatically higher similarities between Van Gogh and Pollock than between Van Gogh and painters such as Monet and Renoir, whom conventional art criticism would think more closely related to Van Gogh’s oeuvre than Pollock’s is. (Dalí and Ernst, by contrast, were farther apart than expected.)
        What is interesting, according to Dr Shamir, is that no single feature makes Pollock’s artistic style similar to Van Gogh’s. Instead, the connection is based on a broad set of image-content descriptors which reflect many aspects of the two artists’ styles, including a shared preference for low-level textures and shapes, and similarities in the ways they employed lines and edges.
        Comment
        • Jontheman
          SBR High Roller
          • 09-09-08
          • 139

          #5
          I don't think that invalidates my point. To a computer they may have similarities, but they are easy to tell apart by any human, even one with an untrained eye. In other words, it confirms that computers are a LONG way behind in this respect.

          Why would you want to scale something that is currently inferior to every human at making sense of information?
          Comment
          • podonne
            SBR High Roller
            • 07-01-11
            • 104

            #6
            Originally posted by Jontheman
            I don't think that invalidates my point. To a computer they may have similarities, but they are easy to tell apart by any human, even one with an untrained eye. In other words, it confirms that computers are a LONG way behind in this respect.

            Why would you want to scale something that is currently inferior to every human at making sense of information?
            Well, we're talking about an application to sports betting, and I think I'm safe in saying that it's not easy for a human to distinguish between a team that will win and a team that will lose. It's not too hard to build a computer program that is better than 50% of humans at picking winners (and that's generous).

            Second, there are a finite number of games that a human can consider, even a human who can distinguish between winners and losers. If I required you to calculate 8,000+ numbers (4,000 for each team) for every matchup, you would be hard pressed to handicap a day's worth of NCAA basketball. A computer can do it in a matter of minutes.

            Third, the whole idea is to find things that distinguish winners from losers that are NOT easily distinguishable to the average person. Anything that's easy to see will already be included in the line. It's only the hidden things (like the subtle yet significant combinations of thousands of factors that make Van Gogh more similar to Pollock than to Monet) that will make you money.
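
            To give a rough sense of scale, building a few thousand matchup descriptors for a full slate of games takes well under a second. The league size, stat counts, and derived features below are all hypothetical:

            Code:
            import numpy as np

            rng = np.random.default_rng(2)
            n_teams, n_raw_stats = 350, 60   # ~NCAA-sized league, 60 raw stats per team
            team_stats = rng.normal(size=(n_teams, n_raw_stats))

            # A day's slate of 80 games, given as (home_idx, away_idx) pairs.
            slate = rng.integers(0, n_teams, size=(80, 2))
            home, away = team_stats[slate[:, 0]], team_stats[slate[:, 1]]

            # Per-matchup descriptors: both sides' raw stats, their differences,
            # ratios, and all pairwise products -- thousands of columns at once.
            features = np.hstack([
                home, away,
                home - away,
                home / (np.abs(away) + 1e-9),
                (home[:, :, None] * away[:, None, :]).reshape(len(slate), -1),
            ])
            print(features.shape)   # (80, 3840)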
            Comment
            • Peregrine Stoop
              SBR Wise Guy
              • 10-23-09
              • 869

              #7
              alforithm?
              Comment
              • Peregrine Stoop
                SBR Wise Guy
                • 10-23-09
                • 869

                #8
                simple models make better predictions than complex ones

                simple models with human input make even better predictions
                Comment
                • chunk
                  SBR Wise Guy
                  • 02-08-11
                  • 808

                  #9
                  Smart bird there.
                  Comment
                  • Wrecktangle
                    SBR MVP
                    • 03-01-09
                    • 1524

                    #10
                    Originally posted by podonne

                    All I ask is that if you make something of this you'll share your results!
                    Right, count on it.
                    Comment
                    • vyomguy
                      SBR Hall of Famer
                      • 12-08-09
                      • 5794

                      #11
                      Originally posted by Peregrine Stoop
                      simple models make better predictions than complex ones; simple models with human input make even better predictions
                      This. You need human input into the models to have success.
                      Comment
                      • FuzzyMathGuru
                        Restricted User
                        • 07-27-11
                        • 3

                        #12
                        These models never work. They are always history-facing, meaning they make sense of what has happened in the past to predict the future, but they are NEVER graded on their predictions. They never apply their model to new data for verification. When you use 1990-2000 data to predict what will happen in 2001 and then compare against the actual results, you'll see no advantage. Pass.
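
                        That grading step is simple to set up: fit only on seasons before the test year, score only on the test year. The data layout and the classifier below are placeholders, not anyone's actual model:

                        Code:
                        import numpy as np
                        from sklearn.linear_model import LogisticRegression

                        rng = np.random.default_rng(3)
                        seasons = np.repeat(np.arange(1990, 2002), 200)   # 200 games per season (fake)
                        X = rng.normal(size=(len(seasons), 40))
                        y = rng.integers(0, 2, size=len(seasons))

                        # Walk forward: train through year N-1, grade on year N only.
                        for test_year in range(1995, 2002):
                            train, test = seasons < test_year, seasons == test_year
                            model = LogisticRegression(max_iter=1000).fit(X[train], y[train])
                            print(test_year, "out-of-sample accuracy:", round(model.score(X[test], y[test]), 3))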
                        Comment
                        • vyomguy
                          SBR Hall of Famer
                          • 12-08-09
                          • 5794

                          #13
                          The biggest problem is extracting the features for the training data.
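
                          For example, rolling raw box scores up into leak-free per-game features might look like this; the column names and the pandas layout are made up for illustration:

                          Code:
                          import pandas as pd

                          box = pd.DataFrame({
                              "date": pd.to_datetime(["2011-01-01", "2011-01-03", "2011-01-05"]),
                              "team": ["Duke", "Duke", "Kansas"],
                              "points": [82, 71, 68],
                              "rebounds": [34, 30, 38],
                              "won": [1, 0, 1],
                          })

                          # Rolling averages that use only games strictly before the current one,
                          # so nothing from the game being predicted leaks into its features.
                          box = box.sort_values("date")
                          feats = (box.groupby("team")[["points", "rebounds"]]
                                      .transform(lambda s: s.shift(1).expanding().mean()))
                          print(box[["date", "team", "won"]].join(feats.add_suffix("_avg_before")))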
                          Comment
                          • Peregrine Stoop
                            SBR Wise Guy
                            • 10-23-09
                            • 869

                            #14
                            Originally posted by FuzzyMathGuru
                            These models never work. They are always history-facing, meaning they make sense of what has happened in the past to predict the future, but they are NEVER graded on their predictions. They never apply their model to new data for verification. When you use 1990-2000 data to predict what will happen in 2001 and then compare against the actual results, you'll see no advantage. Pass.
                            do you know every model being used?
                            Comment