Does Data Mining Get A Bad Rap?

  • HedgeHog
    SBR Posting Legend
    • 09-11-07
    • 10128

    #1
    Does Data Mining Get A Bad Rap?
    It seems the catchall phrase in stats today is data mining. If you think you have a good logical angle, your first instinct is to research its past results. And for better or worse, this is data mining. No matter how good the results you've uncovered, they're tainted (you data mined).

    Sure, sometimes data mining leads to a false conclusion that is really just random noise. But it can also lead you to real gems. Data mining is not necessarily a bad thing. Any other thoughts?
  • Data
    SBR MVP
    • 11-27-07
    • 2236

    #2
    HedgeHog, what you described here is not data mining. Data mining is frowned upon because that process does not start with a logical angle, but rather with a random observation such as "Thursday games" or "coming off a 2-7 point loss against a conference rival".
    • Arnold
      SBR Wise Guy
      • 12-17-07
      • 906

      #3
      Companies pay big bucks for data mining professionals. Would someone pay for it if it were useless?

      Data mining from Wiki:

      Data mining is the principle of sorting through large amounts of data and picking out relevant information. It is usually used by business intelligence organizations, and financial analysts, but it is increasingly used in the sciences to extract information from the enormous data sets generated by modern experimental and observational methods. It has been described as "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data"[1] and "the science of extracting useful information from large data sets or databases".
      • 20Four7
        SBR Hall of Famer
        • 04-08-07
        • 6703

        #4
        HH, what you are doing is back testing a theory. You have a theory (angle) and you want to see how it performed this year, last year, and over the last 3 years. That is perfectly valid. It is still no guarantee the angle will perform well in the future, but it is likely to do so.
        • pico
          BARRELED IN @ SBR!
          • 04-05-07
          • 27321

          #5
          I read the book Fooled by Randomness by Nassim Taleb, and I suggest all of you read it as well. In the book he poses a simple probability question: what are the odds of meeting a random person on the street who has the same birthday as you? The odds are 1 in 365.25. Now, in a room with 23 people, what are the odds that 2 people in that room have the same birthday? The odds are about 50%.
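
For anyone who wants to check the 23-person figure, here is a minimal sketch of the standard birthday calculation. It assumes 365 equally likely birthdays (no leap day), and the function name `shared_birthday_probability` is made up for this illustration. The idea is to multiply out the chance that everyone's birthday is different and subtract that from 1.

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Chance that at least two of `people` share a birthday.

    Assumes every one of `days` birthdays is equally likely.
    """
    p_all_distinct = 1.0
    for k in range(people):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    # Compare one specific match (you vs. one stranger) with any match in a room.
    print("you vs. one stranger:", 1 / 365)
    print("any pair among 23:", shared_birthday_probability(23))
```

The gap between the two numbers comes from the number of pairs: 23 people form 253 possible pairs, so there are many chances for some match even though any one specific match is rare.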

          This is why data mining is frowned upon: you're bound to come up with some random correlation that only exists in the dataset you're mining from.
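
The same effect can be shown with a toy simulation. This is only a sketch under made-up assumptions: every game is a fair coin flip, and each "angle" is a random filter that flags roughly a quarter of the games, with no logic behind it at all. The names `best_random_angle`, `n_games`, and `n_angles` are hypothetical. The point is that if you test enough meaningless angles, the best one will still show a win rate that looks impressive.

```python
import random


def best_random_angle(n_games: int = 200, n_angles: int = 500, seed: int = 0) -> float:
    """Best in-sample win rate found among purely random 'angles'."""
    rng = random.Random(seed)
    # Every game is a 50/50 proposition, so no angle has a real edge.
    results = [rng.random() < 0.5 for _ in range(n_games)]

    best = 0.0
    for _ in range(n_angles):
        # A meaningless angle: flag each game with probability 0.25.
        picked = [won for won in results if rng.random() < 0.25]
        # Ignore angles with too few games to look convincing.
        if len(picked) >= 30:
            best = max(best, sum(picked) / len(picked))
    return best


if __name__ == "__main__":
    print("true win rate of every angle: 0.5")
    print("best win rate found by mining:", best_random_angle())
```

Because the games are coin flips, the "winning" angle has no edge on new games, which is the difference between mining for a pattern and back testing a theory you had before looking at the data.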
          • HedgeHog
            SBR Posting Legend
            • 09-11-07
            • 10128

            #6
            I think data mining needs to be properly defined. I'm talking about all research in general--and I believe it all falls under data mining.
            • durito
              SBR Posting Legend
              • 07-03-06
              • 13173

              #7
              .
              Last edited by durito; 01-17-09, 02:59 PM.