Developing a database

  • Jericholic
    SBR MVP
    • 02-15-10
    • 3099

    #1
    Developing a database
    I recently decided to develop my own database. My question is: what type of information does everyone find most helpful? For instance, for basketball I would obviously want to include wins straight up and against the spread, but what other stats should I include: FG%, PPG, FT%, etc.? Any input would be extremely helpful. Thank you in advance.
  • Peep
    SBR MVP
    • 06-23-08
    • 2295

    #2
    I have been doing databases for gambling for ten years now.

    Two thoughts come to mind.

    1) Download as much data as you can; it is easier to put it all in together at the start. You can always leave columns unused.
    2) Think in advance about what sorts of questions you want to ask your database. For example, how many games back do you want to look?
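
The "how many games back" question can be sketched as a rolling lookback. This is only a minimal illustration; the function name `wins_in_last_n` and the sample results are hypothetical, not anyone's actual setup.

```python
from collections import deque

def wins_in_last_n(results, n=5):
    """For each game, count straight-up wins in the previous n games.

    `results` is a chronological list of booleans (True = win). The count
    for a game uses only games played before it, so it can serve as a
    pre-game field.
    """
    window = deque(maxlen=n)        # holds the most recent n results
    counts = []
    for won in results:
        counts.append(sum(window))  # wins among the previous (up to n) games
        window.append(won)
    return counts

# Made-up results for one team, oldest game first
season = [True, False, True, True, False, True, True]
print(wins_in_last_n(season, n=5))  # [0, 1, 1, 2, 3, 3, 3]
```

Changing `n` answers the same question for a different lookback without touching the stored data.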
    • dodger33
      SBR MVP
      • 08-14-09
      • 3962

      #3
      Does the team have any white guys on it, if so discount them.
      • runt23
        SBR High Roller
        • 02-09-10
        • 134

        #4
        Originally posted by dodger33
        Does the team have any white guys on it, if so discount them.
        hahahaha oh man...

        seriously...
        • runnershane14
          SBR Wise Guy
          • 07-23-07
          • 803

          #5
          assist/turnover ratio
          • Justin7
            SBR Hall of Famer
            • 07-31-06
            • 8577

            #6
            Peep nailed it. Grab everything you can.
            • JohnAnthony
              SBR Hall of Famer
              • 04-30-09
              • 5110

              #7
              Guys, can I add my own question here?

              Which is the best place to get this kind of database going? Where can I get this info in a form that can be easily imported into Excel? How would I go about creating and organizing the spreadsheet "foundation"?
              "I have never seen a wild thing feel sorry for itself. A little bird will fall dead, frozen from a bough, without ever having felt sorry for itself."

              - D.H. Lawrence
              • JohnAnthony
                SBR Hall of Famer
                • 04-30-09
                • 5110

                #8
                LT or anyone? ^
                • Peep
                  SBR MVP
                  • 06-23-08
                  • 2295

                  #9
                  I copy and paste out of Covers mostly. Or out of Don Best or Sportsoptions. Sometimes you have to reformat.
                  • Diesel79
                    SBR MVP
                    • 11-27-08
                    • 1001

                    #10
                    My NBA database includes columns:

                    Date
                    Time
                    Unique serial number for every game
                    Team name
                    Home or away game?
                    Opposite team name
                    How many days since last game?
                    B2B game or 4th game in 5 nights?
                    Is it the 1st game after a long road trip or the 1st game after a long homestand?
                    Spread
                    Total
                    Points scored
                    Points against
                    OT?
                    SU result
                    ATS result
                    O/U result
                    Win % before game (all games)
                    Win % before game (home and away games separately)
                    How many SU wins in last 5 games
                    Pts per game (season average)
                    Pts per game opponent (season average)
                    Pts per game (home and away separately)
                    Pts per game opponent (home and away separately)
                    Last 2 games how many points scored
                    Last 2 games how many opponent scored
                    Last game between teams this season:
                    - points scored
                    - points scored by opponent
                    - SU result
                    - ATS result
                    - O/U result
                    - home/away game
                    - Date

                    Pinny opening and closing line for spread and O/U
                    Pinny 2nd-half line for spread and O/U

                    Last game stats - basically all stats (pts, reb, ast, FG%....... everything)
                    Last game opponent stats - same things
                    Starters scored last game
                    Bench scoring last game
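
A subset of the columns above could be laid out as one row per team per game. The sketch below uses Python's built-in `sqlite3` purely for illustration; the table name `team_games`, the column names, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute("""
    CREATE TABLE team_games (
        game_id     INTEGER NOT NULL,  -- unique serial number for the game
        game_date   TEXT    NOT NULL,  -- ISO date string
        team        TEXT    NOT NULL,
        opponent    TEXT    NOT NULL,
        is_home     INTEGER NOT NULL,  -- 1 = home, 0 = away
        spread      REAL,              -- spread from this team's side
        total       REAL,              -- over/under
        pts_for     INTEGER,
        pts_against INTEGER,
        overtime    INTEGER DEFAULT 0,
        PRIMARY KEY (game_id, team)    -- one row per team per game
    )
""")

# Made-up sample rows: the same game seen from each team's side
rows = [
    (1, "2010-02-16", "Team A", "Team B", 1, -4.5, 195.0, 101, 94, 0),
    (1, "2010-02-16", "Team B", "Team A", 0, 4.5, 195.0, 94, 101, 0),
]
conn.executemany("INSERT INTO team_games VALUES (?,?,?,?,?,?,?,?,?,?)", rows)

# Example question: average points scored at home vs. away
query = "SELECT is_home, AVG(pts_for) FROM team_games GROUP BY is_home ORDER BY is_home"
for is_home, avg_pts in conn.execute(query):
    print("home" if is_home else "away", avg_pts)
```

Storing each game twice (once per team) makes "home and away separately" questions a simple filter, at the cost of some duplication.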

                    I also create a game report that shows how big a lead (or deficit) a team has after every 2 minutes of the game, so with 48 minutes I get 24 numbers. It gives me a much better understanding of how dominant or how bad a certain team is.
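
One possible way to build such a report, assuming you have the running score after each scoring play: the `events` format and the function name `margin_every_two_minutes` are assumptions made for this sketch, and the sample plays are made up.

```python
from bisect import bisect_right

def margin_every_two_minutes(events, game_minutes=48, step=2):
    """Return the team's margin at the end of each `step`-minute interval.

    `events` is a chronological list of (elapsed_minutes, team_pts, opp_pts)
    tuples giving the running score after each scoring play.
    """
    times = [t for t, _, _ in events]
    margins = []
    for checkpoint in range(step, game_minutes + 1, step):
        i = bisect_right(times, checkpoint)  # number of plays at or before checkpoint
        if i == 0:
            margins.append(0)                # nobody has scored yet
        else:
            _, team_pts, opp_pts = events[i - 1]
            margins.append(team_pts - opp_pts)
    return margins

# Made-up scoring plays: (minutes elapsed, team points, opponent points)
events = [(0.5, 2, 0), (1.8, 2, 3), (3.2, 5, 3), (47.9, 100, 98)]
report = margin_every_two_minutes(events)
print(len(report), report[:2])  # 24 [-1, 2]
```

Overtime is ignored here; a longer `game_minutes` value would extend the report.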

                    I'm trying to put in individual player stats too, but I haven't figured out how to do it yet.

                    I hope it helps you with your own database.

                    • runt23
                      SBR High Roller
                      • 02-09-10
                      • 134

                      #11
                      great post diesel. I don't bet on basketball yet but this post helps me out in other ways too. Thanks!
                      • Wrecktangle
                        SBR MVP
                        • 03-01-09
                        • 1524

                        #12
                        Covers has a feature called betgraph on each game which will show the ebb and flow of a side's game. Their totals part is crap and should be fixed, but you can characterize each game using this feature.

                        On NBA I scrape everything they have in the boxscore and use most of it in some fashion. I've always felt that if folks could get together and "cross scrape" much of the onerous work of db "management" could be mitigated. Never seems to happen though.

                        Db work is the single most important thing you can do as everything derives from it, and it is the most dull, mind numbing thing you can do.
                        • roasthawg
                          SBR MVP
                          • 11-09-07
                          • 2990

                          #13
                          Nice post... that's pretty much everything you could ask for in a db.
                          • Peep
                            SBR MVP
                            • 06-23-08
                            • 2295

                            #14
                            Originally posted by roasthawg
                            Nice post... that's pretty much everything you could ask for in a db.
                            Not really. But it is certainly a great list, thank you for the post Diesel.

                            There are a lot of "derived fields" that can be created as well.
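
As an illustration of derived fields, results and rest days can be computed from raw columns rather than typed in. The function names and sample numbers below are hypothetical; the spread is assumed to be from the team's own side (negative = favorite).

```python
from datetime import date

def grade_game(pts_for, pts_against, spread, total):
    """Derive SU, ATS and O/U results from the raw score and the betting numbers."""
    su = "W" if pts_for > pts_against else "L"
    ats_margin = pts_for + spread - pts_against
    ats = "W" if ats_margin > 0 else "L" if ats_margin < 0 else "P"  # P = push
    combined = pts_for + pts_against
    ou = "O" if combined > total else "U" if combined < total else "P"
    return su, ats, ou

def days_rest(previous_game, this_game):
    """Days since the last game; 1 means a back-to-back."""
    return (this_game - previous_game).days

# Made-up example: a 4.5-point favorite wins 101-94 with a total of 195
print(grade_game(101, 94, -4.5, 195.0))                 # ('W', 'W', 'P')
print(days_rest(date(2010, 2, 15), date(2010, 2, 16)))  # 1
```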

                            Projected fields are also interesting. If a team scores 100, 100, 120 what would be their projected score?
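
There is no single right answer to that question. As a rough sketch, here are two simple candidates, a plain average and a recency-weighted average; the `decay` value is an arbitrary assumption, not a recommended setting.

```python
def simple_average(scores):
    """Every game counts equally."""
    return sum(scores) / len(scores)

def recency_weighted(scores, decay=0.5):
    """Weighted average where each older game counts `decay` times the next newer one."""
    n = len(scores)
    weights = [decay ** (n - 1 - i) for i in range(n)]  # newest game has weight 1
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

scores = [100, 100, 120]  # oldest game first
print(round(simple_average(scores), 1))    # 106.7
print(round(recency_weighted(scores), 1))  # 111.4
```

The two answers differ by almost five points, which is why the choice of projection method is itself something to test against the database.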

                            To me that is the fun of doing a database. Set it up so you can ask the kinds of questions you want answered. As in

                            1) When will a 1st Q be most likely to go over? Under?
                            2) Which half is the best bet with this matchup?
                            3) Is the moneyline or the points the best bet?
                            4) What factors go together here? What could I parlay?
                            • Jericholic
                              SBR MVP
                              • 02-15-10
                              • 3099

                              #15
                              Wow, great answers from everyone. I had never thought to discount teams with too many white guys. Probably should have thought of that.
                              • WileOut
                                SBR MVP
                                • 02-04-07
                                • 3844

                                #16
                                How would these databases be able to beat the lines that the books pay pro line-makers to come up with? How confident are you that your line is more accurate than the line the book line-makers come up with, the guys who do this professionally all day every day, with almost unlimited resources?

                                I just don't see how many people can come up with a better line than the bookmakers put out, but obviously many groups are able to do this and make gobs of money.

                                But I don't see how the average joe statistician can do it by just using stats that are readily available to every person in the world. The pros surely have some secret stats that they use that aren't readily known by many people outside the business. Or they know how to use the same stats you do, but know exactly how much weight to put on each one.
                                Last edited by WileOut; 02-17-10, 09:54 AM.
                                • runt23
                                  SBR High Roller
                                  • 02-09-10
                                  • 134

                                  #17
                                  Good post WileOut. It's true, it's tough to determine how much weight to put on each stat. I never used to look that deep into stats (which meant losing money eventually), but I have found some "patterns", even though patterns don't win you money. I shouldn't say patterns, but you guys know what I mean, e.g. teams with a better home record are more likely to win a home game.

                                  If I can interrupt, how do you guys go about keeping your database up to date? And what program do you use? Excel, or something else?

                                  thanks
                                  • runnershane14
                                    SBR Wise Guy
                                    • 07-23-07
                                    • 803

                                    #18
                                    Keeping them up to date is the hard part for me. I can compile all the data in the world but going out each day to get additional data is the tedious part.
                                    • roasthawg
                                      SBR MVP
                                      • 11-09-07
                                      • 2990

                                      #19
                                      Originally posted by WileOut
                                      How would these databases be able to beat the lines that the books pay pro line-makers to come up with? How confident are you that your line is more accurate than the line the book line-makers come up with, the guys who do this professionally all day every day, with almost unlimited resources?

                                      I just don't see how many people can come up with a better line than the bookmakers put out, but obviously many groups are able to do this and make gobs of money.

                                      But I don't see how the average joe statistician can do it by just using stats that are readily available to every person in the world. The pros surely have some secret stats that they use that aren't readily known by many people outside the business. Or they know how to use the same stats you do, but know exactly how much weight to put on each one.
                                      I ask myself this question a lot... there's no doubt that I'm able to identify a bunch of +EV plays every day. But WHY are these plays available? I am confident that I am NOT outsmarting the books, so the question becomes why these +EV plays are available. The answer, I think, is that the books use public betting tendencies to their advantage.
                                      • MrX
                                        SBR MVP
                                        • 01-10-06
                                        • 1540

                                        #20
                                        Originally posted by WileOut
                                        How would these databases be able to beat the lines that the books pay pro line-makers to come up with? How confident are you that your line is more accurate than the line the book line-makers come up with, the guys who do this professionally all day every day, with almost unlimited resources?

                                        I just don't see how many people can come up with a better line than the bookmakers put out, but obviously many groups are able to do this and make gobs of money.

                                        But I don't see how the average joe statistician can do it by just using stats that are readily available to every person in the world. The pros surely have some secret stats that they use that aren't readily known by many people outside the business. Or they know how to use the same stats you do, but know exactly how much weight to put on each one.
                                        You're either underestimating the complexities of a good model, or overestimating the ability of bookmakers (probably both).

                                        I've spent countless hours developing an MLB model. It was good enough to be profitable years ago, but I've never even come close to what I'd consider the perfect model. There are layers upon layers of complexity available to pursue. For every new aspect I add to the model, I become aware of several new concepts that I'd like to integrate. Just properly regressing player projections to the mean can become a very complex task for someone trying to do a good job of it. Strength of competition, park factors, righty/lefty interactions, aging, and weather can all be tricky to do well.
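
For readers new to the idea, the simplest textbook form of regression to the mean is a weighted blend of the observed rate and the league average. This is only a sketch of that basic form, not the method described in the post; the constant `k` and the sample numbers are made up.

```python
def regress_to_mean(observed_rate, sample_size, league_mean, k):
    """Shrink an observed rate toward the league mean.

    `k` is an assumed constant: the sample size at which the observed
    data and the league mean get equal weight.
    """
    weight = sample_size / (sample_size + k)
    return weight * observed_rate + (1 - weight) * league_mean

# Made-up example: a .400 rate over 50 trials, league mean .330, k = 200
print(round(regress_to_mean(0.400, 50, 0.330, 200), 3))  # 0.344
```

The complexity mentioned above comes from everything this sketch leaves out, such as choosing `k` per statistic and adjusting for competition, park, and aging.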

                                        There is much more money to be made betting than a book would ever pay a linemaker, so it makes no sense that the most talented linemakers would be working for the books. I don't have a lot of insight as far as the inner workings of sportsbooks are concerned, but I'm very confident that the methods used to set the lines are far less sophisticated than anything a serious gambler would use.
                                        Last edited by MrX; 02-17-10, 09:03 PM.
                                        • MrX
                                          SBR MVP
                                          • 01-10-06
                                          • 1540

                                          #21
                                          Originally posted by WileOut
                                          The pros surely have some secret stats that they use that aren't readily known by many people outside the business. Or they know how to use the same stats you do, but know exactly how much weight to put on each one.
                                          Though there have surely been cases of the former, I think that overwhelmingly the same base statistics available to the public are being used, they're just being used better.
                                          • MrX
                                            SBR MVP
                                            • 01-10-06
                                            • 1540

                                            #22
                                            Originally posted by runt23
                                            If I can interrupt, how do you guys go about keeping your database up to date? and what program do you use? excel? or something else.
                                            I have the scraping of HTML boxscores, etc., and the loading of those stats into my database automated as part of my model. The day I automated all of this was a great leap in my quality of life!

                                            Database: MySQL
                                            Some research: Excel
                                            Modeling: Visual Basic
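
A rough idea of what automating the scrape-and-load step can look like, using only the Python standard library. This is a sketch under stated assumptions: the HTML snippet, the `TableRows` class, and the `boxscore` table are made up for illustration, and `sqlite3` stands in for MySQL. Real boxscore pages will need site-specific parsing.

```python
import sqlite3
from html.parser import HTMLParser

class TableRows(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_cell = [], None, False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row:  # skip rows with no <td> (e.g. headers)
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()

# Made-up boxscore fragment standing in for a downloaded page
SAMPLE_HTML = """
<table>
  <tr><th>Player</th><th>PTS</th><th>REB</th><th>AST</th></tr>
  <tr><td>Player One</td><td>24</td><td>7</td><td>5</td></tr>
  <tr><td>Player Two</td><td>11</td><td>3</td><td>9</td></tr>
</table>
"""

parser = TableRows()
parser.feed(SAMPLE_HTML)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE boxscore (player TEXT, pts INTEGER, reb INTEGER, ast INTEGER)")
conn.executemany(
    "INSERT INTO boxscore VALUES (?, ?, ?, ?)",
    [(name, int(pts), int(reb), int(ast)) for name, pts, reb, ast in parser.rows],
)
print(conn.execute("SELECT SUM(pts) FROM boxscore").fetchone()[0])  # 35
```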
                                            • runt23
                                              SBR High Roller
                                              • 02-09-10
                                              • 134

                                              #23
                                              Thanks MrX! I know a bit of all of those (best at Excel), so hopefully I will be able to put something together! Thanks again.
                                              • Peep
                                                SBR MVP
                                                • 06-23-08
                                                • 2295

                                                #24
                                                I update yearly.
                                                • Wrecktangle
                                                  SBR MVP
                                                  • 03-01-09
                                                  • 1524

                                                  #25
                                                  WileOut's point of folks who just work dbs (not true modelers) not finding situations where they don't have an advantage is not the case. I guess I should rewrite this where I don't have a double negative --> i.e. filtering dbs for situations where combining factors in large dbs does work. I have seen single factors that work and show enough advantage to overcome the vig. This is really angle handicapping, and it seems the market tends to price these things into the line as they become widely known. Maybe the most famous of these was Monday Night home dogs in the NFL; it worked for decades. It didn't happen often though.

                                                  Modeling takes much more skill, and can turn up situations with greater advantage.
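
To put a number on "enough advantage to overcome the vig": the break-even win rate follows directly from the odds. A minimal sketch, assuming American odds; the function name `break_even_probability` is made up here.

```python
def break_even_probability(american_odds):
    """Win probability needed to break even at the given American odds."""
    if american_odds < 0:
        risk, win = -american_odds, 100  # e.g. -110: risk 110 to win 100
    else:
        risk, win = 100, american_odds   # e.g. +150: risk 100 to win 150
    return risk / (risk + win)

print(round(break_even_probability(-110), 4))  # 0.5238
print(round(break_even_probability(150), 4))   # 0.4
```

At standard -110 pricing, an angle has to win more than about 52.4% of the time before it shows any profit.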
                                                  • WileOut
                                                    SBR MVP
                                                    • 02-04-07
                                                    • 3844

                                                    #26
                                                    Originally posted by Wrecktangle
                                                    WileOuts' point of folks who just work dbs (not true modelers) not finding situations where they don't have an advantage is not the case. I guess I should rewrite this where I don't have a double negative --> i.e. filtering dbs for situations where combining factors in large dbs does work. I have seen single factors that work and show enough advantage to overcome the vig. This is really angle handicapping, and it seems the market tends to price these things into the line as they become widely known. Maybe the most famous of these was Monday Night home Dogs in the NFL; worked for decades. Didn't happen often though.

                                                    Modeling takes much more skill, and can turn up situations with greater advantage.
                                                    I think you overestimated my understanding of what it is you speak of.

                                                    I don't know the difference between "folks who just work databases" and "true modelers".

                                                    This is why I stay out of the Think Tank for the most part and I don't mean to get off topic here.

                                                    My point was that the book linemakers have a big advantage over Joe "I want to learn how to come up with a better line than Bookmaker has" Gambler. I would think that most will never succeed in coming up with a better line or even one that is just as good as the book linemakers. Therefore maybe the novice gambler should look to other ways of beating the book, since the average gambler will never be able to come up with a line that beats the linemaker at the book. But I think this is probably another topic for another thread.

                                                    What about the groups I hear about, like the Computer Group? Do these guys simply come up with the same line the book linemakers come up with and then bet against the public lean? Or do they really come up with a better line? It must take a genius to actually come up with a better line than the books do. It must take years of trial and error to learn how to weigh the different stats to actually come up with a line that beats the bookmaker.

                                                    Then again I guess one can attack softer markets and that makes it easier to come up with the better line.
                                                    Last edited by WileOut; 02-18-10, 07:39 AM.
                                                    • Peep
                                                      SBR MVP
                                                      • 06-23-08
                                                      • 2295

                                                      #27
                                                      I have never set up a database with the goal of setting a better line than the oddsmaker. I didn't think I could do that. If someone can, they should own it all at some point in time.

                                                      I thought I could set up a database that could get better derivative numbers than the sportsbooks widely post. I still think my database can do that.

                                                      Please note that I said "than the sportsbooks widely post". This is a different statement than better derivatives than the sportsbooks can get. They may well want to post an "off number" if they feel it will increase their profit. This is especially true of props. I like to know what these numbers are.
                                                      • runt23
                                                        SBR High Roller
                                                        • 02-09-10
                                                        • 134

                                                        #28
                                                        I like to look at historical data to help me pick who I think is going to win. A lot of the time I don't even look at the line until I have determined who I want to bet on.

                                                        When you have a "system" in place that generates a good winning record, it shouldn't matter too much what the odds are if your chances of winning are greater than your chances of losing. That's what I think though.

                                                        Obviously it's not as worthwhile to bet huge favorites at low odds, which minimizes your profits, and if you lose, it takes longer to win it back.
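
How much the odds matter can be checked with a quick expected-value calculation. This is a sketch with made-up win probabilities, assuming American odds and ignoring pushes; `expected_value` is a hypothetical helper.

```python
def expected_value(win_prob, american_odds, stake=100.0):
    """Expected profit on a bet of `stake`, ignoring pushes."""
    if american_odds < 0:
        profit_if_win = stake * 100 / -american_odds
    else:
        profit_if_win = stake * american_odds / 100
    return win_prob * profit_if_win - (1 - win_prob) * stake

# A pick that wins 70% of the time still loses money at -300...
print(round(expected_value(0.70, -300), 2))  # -6.67
# ...while a pick that wins only 55% of the time is profitable at -110.
print(round(expected_value(0.55, -110), 2))  # 5.0
```

Under these assumed numbers, a winning record alone is not enough; the win rate has to be compared against the price being paid.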