Database suggestions?

  • so im zach
    SBR Wise Guy
    • 01-07-09
    • 585

    #1
    Database suggestions?
    Planning on web scraping NCAA basketball data, but I'm unsure which database solutions I should be looking into. Would like multiple filter/sorting options for teams, conferences, location, etc.

    What should I be looking into? Any books I should be reading? (I already have a couple on data warehousing and statistical analysis, but any you guys recommend would be awesome.)

    Also, how happy are you with your automated web scraping solution? Any issues you've run into or tips you feel like sharing?

    I definitely appreciate whatever knowledge you guys can share on databases or issues you ran into when setting up your own.

    Thanks a lot.
  • Peep
    SBR MVP
    • 06-23-08
    • 2295

    #2
    I use Excel for working with the data and Access for viewing it.

    I don't know how to scrape, so I just do copy and paste with the data I collect.
    • Wrecktangle
      SBR MVP
      • 03-01-09
      • 1524

      #3
      Access is OK, but SQL is better.
      • MrX
        SBR MVP
        • 01-10-06
        • 1540

        #4
        SQL is pretty much a slam dunk. Free, powerful, and flexible.
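
        To make the filter/sort use case from the original question concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module purely as a stand-in for whichever SQL database you pick (the thread mentions MySQL and PostgreSQL). The table layout, the column names, the helper names build_demo_db and teams_by, and the placeholder rows are all assumptions made up for illustration, not real data or anyone's actual setup.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical teams table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE teams ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL UNIQUE,"
        " conference TEXT NOT NULL,"
        " state TEXT NOT NULL)"
    )
    # Placeholder rows only; a scraper would supply the real values.
    rows = [
        ("Team A", "Conf X", "OH"),
        ("Team B", "Conf X", "IN"),
        ("Team C", "Conf Y", "OH"),
    ]
    conn.executemany(
        "INSERT INTO teams (name, conference, state) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn


def teams_by(conn, conference=None, state=None):
    """Return teams matching the optional filters, sorted by name."""
    clauses, params = [], []
    if conference is not None:
        clauses.append("conference = ?")
        params.append(conference)
    if state is not None:
        clauses.append("state = ?")
        params.append(state)
    sql = "SELECT name, conference, state FROM teams"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY name"
    # Values are passed as parameters rather than pasted into the SQL string.
    return conn.execute(sql, params).fetchall()
```

        Calling teams_by(build_demo_db(), conference="Conf X") would return only the rows for that conference, sorted by name. The same WHERE / ORDER BY pattern should carry over to other SQL databases, though the Python driver and connection setup would differ.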
        • durito
          SBR Posting Legend
          • 07-03-06
          • 13173

          #5
          I use MySQL.
          • xyz
            SBR Wise Guy
            • 02-14-08
            • 521

            #6
            I want to get an idea about the size of the data. Is it in the MBs, GBs, or TBs? If you can scrape, store, query, and analyze as much data as you want, how much better would your models be? Thanks for the info.
            • MrX
              SBR MVP
              • 01-10-06
              • 1540

              #7
              Originally posted by xyz
              I want to get an idea about the size of the data. Is it in the MBs, GBs, or TBs? If you can scrape, store, query, and analyze as much data as you want, how much better would your models be? Thanks for the info.
              MBs, and I specialize in the most data-intensive sport using data down to the individual pitch level, so I don't know how anyone's going to get into the TBs!

              In theory, the perfect model would use all of the pertinent data available. More tends to be better, but more can also lead to overfitting, too much complexity, excessive computing time, headaches, nausea, etc.
              • Indecent
                SBR Wise Guy
                • 09-08-09
                • 758

                #8
                PostgreSQL here