An introduction to research

  • ljump12
    SBR High Roller
    • 12-08-09
    • 109

    #1
    An introduction to research
    I'm writing this post as an intro to using computers for sports betting. Programming isn't as hard as most people think, and the basic skills can be picked up in a weekend. This is by no means an extensive resource, just a brief introduction. It is my belief that the best way to beat the books is with extensive research and backtesting. What is taught here will not give you the answers. There are no 20*play GOTY locks in this thread, only the tools that will allow you to succeed. Also note this is very much a work in progress; I will post new sections as I write them. If you have suggestions, would like to contribute, etc., just post!

    Sections:
    1) Intro to programming (Taught in Python)
    a) What you need to get started
    b) Basics of programming
    c) Basics of data input & output
    d) How to scrape the internet for data
    e) How to manipulate the data for excel
    2) Intro to excel
    a) How to load in data files
    b) What can be done in excel
    3) Intro to standard wagering ideas
    a) Arbitrage
    b) Kelly Criterion
    Python is one of many programming languages, and it allows us to gather, manipulate, and apply data. I believe Python is the best language for a beginner to learn because it reads like English but is still extremely powerful.

    Section A) What you need to get started..

    Since you're reading this thread I'll assume you have a computer. Python is a platform-independent scripting language, which means that it *should* run the same across different operating systems [Windows, Mac, Unix etc]. For this tutorial, I'm going to assume you have Mac/Linux because that is what I'm familiar with. However, it should be pretty easy to generalize to Windows.

    Downloading Python
    If you're on Windows you will need to download Python and IDLE [ http://www.python.org/download/ ]
    Get version 2.6.* -- don't get version 3. A lot has changed in version 3, and most old code is not supported, making it a pain in the ass. Trust me on this. Version 2.6.* is what you want.

    Good news: if you're on Mac or Linux, you probably already have Python!
    Open up Terminal [Mac users hit apple+space to bring up Spotlight, and type in terminal].
    Type in "python -V" and press enter. It should tell you which version of Python is installed. Even if it's not version 2.6.*, it will probably still do, as long as it's > 2.3 and < 3.0.
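If you'd rather check from inside Python itself, here is a minimal sketch using the standard sys module. The helper name version_ok is made up for illustration, and the range it tests is just the "> 2.3 and < 3.0" rule of thumb from above. It is written so that it should also run on Python 3, where it will simply report that you are outside the suggested range.

```python
from __future__ import print_function
import sys

def version_ok(version_info):
    """Return True if (major, minor) is above 2.3 and below 3.0."""
    major_minor = tuple(version_info[:2])
    return (2, 3) < major_minor < (3, 0)

## Show which interpreter is running this script and whether it fits the range
print("Running Python %d.%d" % tuple(sys.version_info[:2]))
print("In the suggested range:", version_ok(sys.version_info))
```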

    Writing Python Programs
    Python programs should be written in a text editor, in a monospaced font...
    Windows Users: There's a good editor called "notepad++"; google it. Alternatively, when you download Python it will come with an editor. You could use that...
    Mac Users: I like a program called "TextMate", though you need to pay for it. There's probably a free trial somewhere.

    Section B) Basics of Programming..

    Learning Python:
    I could type up a basic tutorial in Python, but I'd be reinventing the wheel. John wrote a great introduction to programming that you can find here: http://books.google.com/books?id=aJQ...age&q=&f=false

    I'd suggest you read this through. Read at least the first 4 chapters. Spend a day and DO THE EXAMPLES. The only way to learn programming is by doing. It's really not hard stuff, it just takes some time to get the basics. Again, don't just read it or you will learn nothing. Take some time and practice practice practice. You can post questions or snippets of code in this thread if you're having problems. I'm sure I, or someone else can find and fix your problem.

    Section C) Basics of data input & output..

    If you have gotten to this point, you should already know the basics of python. You should know what an "if statement" is, what a "for loop" is, and how to print "Hello World!".

    In general the tasks we are trying to do with python will either be taking data from excel and manipulating/running tests on it, or getting data from the internet, and writing it to an excel file for easier access. We can do both with python! Excel takes in what is known as a "CSV" or comma separated file, and displays it in spreadsheet format, so all we have to do is have our python program output a file that is comma separated -- and we can load it right into excel.
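To make the "comma separated" idea concrete, here is a small sketch that parses a few lines of CSV text with the csv module. The sample lines, team codes, and column names are invented for illustration and are not taken from any real file. Note that every value comes back as a string, so numbers need converting with float() or int() before you do math on them.

```python
from __future__ import print_function
import csv

## Made-up sample data: one header line followed by two games
sample_lines = [
    "Date,Teams,OU_Line,Runs",
    "9/21/09,aaa at bbb,8.5,9",
    "9/21/09,ccc at ddd,9,7",
]

## csv.reader accepts any iterable of text lines, not just an open file
rows = list(csv.reader(sample_lines))
for row in rows:
    print(row)   ## each row is a list of strings
```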

    Let's start with a simple example. I have uploaded a .csv file to my website that contains MLB game information for a single day. Download and save this file into the same directory that your Python script will run from. If you open the file in Excel, you will get a better idea of what is inside it. You'll find the file here: http://atbgreen.com/mlb_ex_1.csv

    [Opening and Reading a .csv File]
    PHP Code:
    ## Created on 4/1/10
    ## This example shows how to open and read a .csv file.
    ##
    ## Notes: The file mlb_ex_1.csv should be downloaded and in the same folder
    ## .. as this script
    
    ## Tell python to import the csv module, because we will be reading a .csv
    import csv
    
    ## First we need to open the file
    mlb_file    = open("mlb_ex_1.csv","r") ## Open the MLB .csv file for reading
    
    ## Now we need to tell python to read it as a csv file
    ## We are opening the mlb_file as defined above, it is delimited by commas,
    ## and our quote character is a regular quote (")
    mlbReader = csv.reader(mlb_file, delimiter=',', quotechar='"') 
    
    ## Grab the first line, because it is the headers..
    headers = mlbReader.next()
    
    ## It's now time to iterate through the file row by row...
    for row in mlbReader:
        ## Let's try and only print the Over/Under Line, and the actual runs scored
        ## .. in the game. If you look at the .csv in excel you will see these are
        ## .. in the 6th & 7th columns. But since the computer starts counting at 0, 
        ## .. we would say they are at positions 5 and 6
        ou_line     = float(row[5]) ## This should be a float, because it can be .5
        runs_scored = int(row[6])   ## This will be an int, because runs are integers
    
        print "The line was",ou_line,"and",runs_scored,"runs were scored"
    
    ## End of program 
    
    Save and run the code. I've commented it generously so you can tell exactly what's going on. It looks long, but that's only because I've tried to make it as clear as possible. If I wanted, I could compress the code into 3 lines -- but it's not nearly as easy to understand.

    [Opening and Reading a .csv File (in 3 lines)]
    PHP Code:
    import csv
    mlbReader = csv.reader(open("mlb_ex_1.csv"),delimiter=',',quotechar='"')
    for row in mlbReader: print "The line was",row[5],"and",row[6],"runs were scored" 
    
    Let's go a step further this time, and do some calculations with our file. Let's determine whether the game went over or under.
    [Opening and Reading a .csv File, and determining over or under]
    PHP Code:
    ## Created on 4/1/10
    ## This example shows how to open and read a .csv file, and perform
    ## .. some simple calculations
    ##
    ## Notes: The file mlb_ex_1.csv should be downloaded and in the same folder
    ## .. as this script
    
    ## Tell python to import the csv module, because we will be reading a .csv
    import csv
    
    total_overs = 0     ## Initialize the total number of overs to 0
    total_unders = 0   ## Initialize the total number of unders to 0
    
    ## First we need to open the file
    mlb_file    = open("mlb_ex_1.csv","r") ## Open the MLB .csv file for reading
    
    ## Now we need to tell python to read it as a csv file
    ## We are opening the mlb_file as defined above, it is delimited by commas,
    ## and our quote character is a regular quote (")
    mlbReader = csv.reader(mlb_file, delimiter=',', quotechar='"') 
    
    ## Grab the first line, because it is the headers..
    headers = mlbReader.next()
    
    ## It's now time to iterate through the file row by row...
    for row in mlbReader:
        ## First we need to get the OU_Line, and runs scored out of the file.
        ou_line     = float(row[5]) ## This should be a float, because it can be .5
        runs_scored = int(row[6])   ## This will be an int, because runs are integers
    
        ## Now let's compare the two with an if statement to see what happened.
        ## Fewer runs than the line is an Under, more runs than the line is an Over.
        if runs_scored < ou_line:
            ou_result = "Under"
            total_unders += 1
        elif runs_scored > ou_line:
            ou_result = "Over"
            total_overs += 1
        else:
            ou_result = "Push"
    
        ## Finally let's put it all together in one print statement
        print "The line was",ou_line,"and",runs_scored,"runs were scored, so the game went",ou_result
        ## END OF FOR LOOP
    
    ## Calculate the percent of decided games (pushes excluded) that went over,
    ## .. rounded to 2 decimal places. The check avoids dividing by zero.
    decided_games = total_overs + total_unders
    if decided_games > 0:
        over_under_percentage = round(100.0 * total_overs / decided_games, 2)
    else:
        over_under_percentage = 0.0
    
    print "There were",total_overs,"Overs"
    print "There were",total_unders,"Unders"
    print over_under_percentage,"percent of games went Over"
    
    ## End of program 
    
    That's really all there is to reading in a file. What you do after you have read the file in is completely up to you. All the columns are accessible in the "row" array, and can be accessed by asking for a position out of the array. Remember the position is always one less than its column number. For example, if you want the 7th column, you would do row[6].
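If counting columns by hand gets tedious, one option is to let Python print each header next to its position. This is only a sketch: the header names and the sample row below are invented, and the real file's headers may well be different.

```python
from __future__ import print_function
import csv

## Made-up header line and one made-up game
sample_lines = [
    "Date,Time,Teams,Home_Score,Away_Score,OU_Line,Runs_Scored",
    "9/21/09,7:10,aaa at bbb,5,4,8.5,9",
]
reader = csv.reader(sample_lines)
headers = next(reader)   ## first line holds the column names

## Print every header with its position and build a name -> position lookup
column_index = {}
for position, name in enumerate(headers):
    print(position, name)
    column_index[name] = position

first_row = next(reader)
print("O/U line:", first_row[column_index["OU_Line"]])
```

With a lookup like this you can write first_row[column_index["OU_Line"]] instead of remembering that the line lives at position 5.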

    Let's move on to data output. Let's further expand on our old example, and say after we calculate whether the game went over or under, we want to write it to a new file. We want our new file to have three columns. Date, Teams, OverUnder. If we look in our sheet we will see that the date and teams are in columns 1 and 3 respectively. We will call our new file MLB_output.csv

    PHP Code:
    ## Created on 4/1/10
    ## This example shows how to read a .csv file, perform some simple
    ## .. calculations, and write the results to a new .csv file
    ##
    ## Notes: The file mlb_ex_1.csv should be downloaded and in the same folder
    ## .. as this script
    
    ## Tell python to import the csv module, because we will be reading a .csv
    import csv
    
    ## First we need to open both files
    mlb_file    = open("mlb_ex_1.csv","r") ## Open the MLB .csv file for reading
    output_file = open("MLB_output.csv","w") ## Open the output file for writing
    
    ## Now we need to tell python to read it as a csv file
    ## We are opening the mlb_file as defined above, it is delimited by commas,
    ## and our quote character is a regular quote (")
    mlbReader = csv.reader(mlb_file, delimiter=',', quotechar='"') 
    
    ## We'll do the same for our writer. We need to tell it where we will be writing
    ## .. to, and what kind of delimiters we want to use.
    mlbWriter = csv.writer(output_file, delimiter=',', quotechar='"')
    
    ## Grab the first line, because it is the headers..
    headers = mlbReader.next()
    
    ## It's now time to iterate through the file row by row...
    for row in mlbReader:
        ## First we need to get the OU_Line, and runs scored out of the file.
        ou_line     = float(row[5]) ## This should be a float, because it can be .5
        runs_scored = int(row[6])   ## This will be an int, because runs are integers
    
        ## Now let's get the other information we need out (Date and Teams)
        date    = row[0]
        teams   = row[2]
    
        ## Now let's compare the two with an if statement to see what happened.
        ## Fewer runs than the line is an Under, more runs than the line is an Over.
        if runs_scored < ou_line:
            ou_result = "Under"
        elif runs_scored > ou_line:
            ou_result = "Over"
        else:
            ou_result = "Push"
    
        ## Instead of printing here like we did before, we want to write to the file
        mlbWriter.writerow([date,teams,ou_result])
        ## END OF FOR LOOP
    
    output_file.close() ## Close the file after we have written everything
    print "The program has written everything!"
    ## End of program 
    
    Try running the program. After you do, you should see that a new file has been created. This file will contain exactly what we expect:

    Code:
    9/21/09,atl at nyn,Over
    9/21/09,bal at tor,Over
    9/21/09,bos at kca,Over
    9/21/09,chn at mil,Over
    9/21/09,min at cha,Under
    9/21/09,nya at ana,Under
    9/21/09,sdn at pit,Over
    9/21/09,sln at hou,Over
    9/21/09,tex at oak,Over
    That's really all there is to basic input and output of files!
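As a compact recap of the write step, here is a sketch that pulls the over/under decision into a small function and writes the results with csv.writer. Everything named here (classify_total, the games list, the team codes, the numbers) is invented for illustration, and an in-memory buffer stands in for a real output file so nothing gets written to disk.

```python
from __future__ import print_function
import csv
try:
    from StringIO import StringIO   ## Python 2
except ImportError:
    from io import StringIO         ## Python 3

def classify_total(ou_line, runs_scored):
    """More runs than the line is an Over, fewer is an Under, equal is a Push."""
    if runs_scored > ou_line:
        return "Over"
    if runs_scored < ou_line:
        return "Under"
    return "Push"

## Made-up games: (date, teams, over/under line, runs scored)
games = [
    ("9/21/09", "aaa at bbb", 8.5, 9),
    ("9/21/09", "ccc at ddd", 9.0, 7),
    ("9/21/09", "eee at fff", 8.0, 8),
]

out_buffer = StringIO()   ## stands in for a file opened for writing
writer = csv.writer(out_buffer, delimiter=',', quotechar='"')
for date, teams, ou_line, runs_scored in games:
    writer.writerow([date, teams, classify_total(ou_line, runs_scored)])

output_text = out_buffer.getvalue()
print(output_text)
```

Keeping the comparison in one function means there is a single place to check when the Over/Under labels look wrong.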

    Section D) How to scrape the internet for data

    To be continued......

    Last edited by ljump12; 04-02-10, 11:56 AM.
  • MrX
    SBR MVP
    • 01-10-06
    • 1540

    #2
    Potentially one of the best threads on here. Good job.

    I would suggest throwing some mysql into the mix.
    • IrishTim
      SBR Wise Guy
      • 07-23-09
      • 983

      #3
      Originally posted by MrX
      Potentially one of the best threads on here. Good job.

      I would suggest throwing some mysql into the mix.
      No doubt. Looking forward to seeing this one progress.
      • Jule
        SBR Sharp
        • 04-02-10
        • 404

        #4
        Great info. :-)
        • jessetopolski
          SBR High Roller
          • 12-20-09
          • 162

          #5
          interesting
          • uva3021
            SBR Wise Guy
            • 03-01-07
            • 537

            #6
            great thread, looking forward to continue reading
            • statnerds
              SBR MVP
              • 09-23-09
              • 4047

              #7
              excellent.

              thanks for the huge contribution. i will try my hand at this
              • Wrecktangle
                SBR MVP
                • 03-01-09
                • 1524

                #8
                Nicely written article. Are there any subroutine libraries we can obtain so we don't have to write all our typically used routines from scratch?
                • ljump12
                  SBR High Roller
                  • 12-08-09
                  • 109

                  #9
                  As far as Python and web scraping goes, you're going to want to obtain BeautifulSoup and something called Mechanize. I intend to write about these in the next section. I may try to write up a commonly used betting function library; I have a bunch of functions lying around that I could probably scrape together.
                  • benjaminj78
                    SBR Rookie
                    • 03-13-10
                    • 15

                    #10
                    Python is a lifesaver when learning to code. This is the best reference anywhere in regard to building the ultimate spreadsheet set-up. Thanks!
                    • Wrecktangle
                      SBR MVP
                      • 03-01-09
                      • 1524

                      #11
                      OK, is there something about Python that makes it easier or more forgiving to program in than other languages?

                      Why not use some variant of C? or VBA? or Java?
                      • ljump12
                        SBR High Roller
                        • 12-08-09
                        • 109

                        #12
                        Originally posted by Wrecktangle
                        OK, is there something about Python that makes it easier or more forgiving to program in than other languages?

                        Why not use some variant of C? or VBA? or Java?
                        Python is infinitely easier than C, Visual Basic and Java. It will become more apparent the more you do with it, but take a look at the code to open and parse a CSV file: 3 readable lines. Show me a short, elegant solution like that in C or Java. Don't get me wrong, compiled languages like C and Java have their place -- and are much, much more efficient. But for this type of project, you will want to use a scripting language (Python, Ruby, Perl etc...). I've just chosen Python because it's what I'm most familiar with.

                        Bottom line: I can write something in Python in 1/3 the time it takes me to write something in C or Java, with increased readability/maintainability, and without losing any functionality.
                        • MonkeyF0cker
                          SBR Posting Legend
                          • 06-12-07
                          • 12144

                          #13
                          It may be slightly faster to code parsing a CSV, but why would you be working much with CSV's anyway? Personally, I hate them. Generally, you'd be scraping directly into a DB of some sort anyway. At least, you probably should.
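For anyone curious what "scraping directly into a DB" could look like from Python, here is a rough sketch. It uses the sqlite3 module that ships with Python rather than MySQL, purely because it needs no server to try out. The table name, column names, and rows are all invented for illustration.

```python
from __future__ import print_function
import sqlite3

## An in-memory database keeps the example self-contained (no file, no server)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE games (game_date TEXT, teams TEXT, ou_line REAL, runs_scored INTEGER)"
)

## Made-up rows, in the same shape a scraper or a CSV reader might produce
rows = [
    ("9/21/09", "aaa at bbb", 8.5, 9),
    ("9/21/09", "ccc at ddd", 9.0, 7),
    ("9/21/09", "eee at fff", 8.0, 8),
]
conn.executemany("INSERT INTO games VALUES (?, ?, ?, ?)", rows)
conn.commit()

## Once the data is in a table, questions become one-line queries
cursor = conn.execute("SELECT COUNT(*) FROM games WHERE runs_scored > ou_line")
overs = cursor.fetchone()[0]
print("Games that went over:", overs)
conn.close()
```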
                          • MonkeyF0cker
                            SBR Posting Legend
                            • 06-12-07
                            • 12144

                            #14
                            Also, I prefer my efficiency to be on the executable side and not the coding side of my projects. However, for the purposes of this thread, I would agree that Python is far better as a tutorial on a more efficient language would be much more involved.
                            • Daniel
                              SBR Rookie
                              • 03-30-10
                              • 22

                              #15
                              Not that it really matters, but parsing a CSV file with C# is just a few lines of code too.

                              Code:
                              TextReader tr = new StreamReader(fileName);
                              string line;
                              while ((line = tr.ReadLine()) != null)
                              {
                                 string[] brokenDown = line.Split(',');
                                 // Do what you want with an array of strings split at the commas
                              }
                              What it boils down to is obviously language preference.. everyone has their own language that they feel the most comfortable with. The end result is still based more on the logic than which language you used.
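One caveat with splitting on commas by hand, in any language: a quoted field that itself contains a comma gets cut in two. A quick sketch of the difference in Python, using a made-up line (the name and numbers are invented):

```python
from __future__ import print_function
import csv

## Made-up line where the second field is quoted and contains a comma
line = '9/21/09,"Smith, John",8.5'

naive_fields = line.split(",")   ## plain split knows nothing about quotes
csv_fields = next(csv.reader([line], delimiter=',', quotechar='"'))

print(len(naive_fields), naive_fields)   ## the quoted name is cut in two
print(len(csv_fields), csv_fields)       ## the quoted name stays in one field
```

If the data never contains quoted commas, the plain split is fine. A CSV-aware parser just removes that assumption.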
                              • Wrecktangle
                                SBR MVP
                                • 03-01-09
                                • 1524

                                #16
                                I guess this sort of goes to my point. Granted Python is good for scraping, what does it have for me as far as modeling sports? If I'm a newbie, not wanting to learn two languages to do my programming (one to scrape, one to organize my modeling efforts, perhaps R), which one should I use? In my world, checked-out scientific routines are important. Reinventing statistical wheels might be fun for some, but I'd rather directly model sports techniques.
                                • Daniel
                                  SBR Rookie
                                  • 03-30-10
                                  • 22

                                  #17
                                  Like I said, language is secondary. Choose the one you're comfortable using, be it Python or .net or whatever. For this project, it's still going to be the output that matters. As far as I know, no language is more suited to statistical analysis than another. At least not for a project such as this.

                                  I'm a .net developer, I would choose C# over Python every day of the week and twice on Sundays.. but for someone else the preference is probably completely opposite. That doesn't mean that I can create something that more accurately analyzes data than the other guy.
                                  • MonkeyF0cker
                                    SBR Posting Legend
                                    • 06-12-07
                                    • 12144

                                    #18
                                    It also depends on your environment. .NET won't do you much good in Linux. That said, I use .NET to code the majority of my handicapping apps.
                                    • MonkeyF0cker
                                      SBR Posting Legend
                                      • 06-12-07
                                      • 12144

                                      #19
                                      If you'd like to concentrate on only one language, I'd honestly go with C#. There are a lot of things that I do with arrays and structs in my models that would be extremely cumbersome and inefficient in Python.
                                      • ljump12
                                        SBR High Roller
                                        • 12-08-09
                                        • 109

                                        #20
                                        Originally posted by Wrecktangle
                                        I guess this sort of goes to my point. Granted Python is good for scraping, what does it have for me as far as modeling sports? If I'm a newbie, not wanting to learn two languages to do my programming (one to scrape, one to organize my modeling efforts, perhaps R), which one should I use? In my world, checked-out scientific routines are important. Reinventing statistical wheels might be fun for some, but I'd rather directly model sports techniques.
                                        R is completely different from Python; they do very different things, and both are very valuable to know. Python is more versatile than R in my opinion, and if you were only going to learn one language, Python would be it imo. However, I don't want to waste more time arguing for Python. It's mostly a personal choice -- you can use any language you'd like, and the concepts taught here should still generally apply.
                                        • sharpcat
                                          Restricted User
                                          • 12-19-09
                                          • 4516

                                          #21
                                          ljump12, I think many are interested to see your write-up. Please continue.

                                          As for the arguing about what language everybody prefers to code in, please start your own threads and let the man continue with his thread. Let's allow this to be the educational thread it was meant to be. I am sure everybody is capable of starting their own thread if they feel the need to prove that they are more intelligent and that their technique is better.
                                          • ljump12
                                            SBR High Roller
                                            • 12-08-09
                                            • 109

                                            #22
                                            Originally posted by MonkeyF0cker
                                            If you'd like to concentrate on only one language, I'd honestly go with C#. There are a lot of things that I do with arrays and structs in my models that would be extremely cumbersome and inefficient in Python.
                                            I think efficiency is kind of a moot point at this stage. We're not dealing with things in handicapping where efficiency would matter. I've written a Python baseball simulator that processes millions of rows of PlayByPlay data, and it has no trouble.
                                            • OMGRandyJackson
                                              SBR MVP
                                              • 02-07-10
                                              • 1680

                                              #23
                                              Once I get some free time, I'm going to check out the Python tutorial and get started on this. Cannot wait for you to continue with the topics!

                                              Thanks so much man!
                                              • ljump12
                                                SBR High Roller
                                                • 12-08-09
                                                • 109

                                                #24
                                                Section D) How to scrape the internet for data

                                                One of the most important aspects of research is the data that you have. Without data, there can't be any model. Fortunately, most data is free -- Unfortunately, most data isn't immediately in the best computer parsable formats [like .csv, or .xml]. To get the data into formats we can use we will need to "scrape" websites for it.

                                                A couple of "packages" have been created that will greatly improve our ability to scrape webpages. It can certainly be done in Python without them -- but they will make your life a whole lot easier:

                                                Mechanize - This will allow us to open webpages easily (http://wwwsearch.sourceforge.net/mechanize/)
                                                Beautiful Soup - This will allow us to parse apart the webpages (http://www.crummy.com/software/BeautifulSoup/)

                                                Installing Beautiful Soup is pretty easy: you can just put the Beautiful Soup Python file ( http://www.crummy.com/software/Beaut...lSoup-3.0.0.py ) in the same directory you are running your code from.

                                                Installing Mechanize is a little tougher. On a *nix machine, cd to the directory where you downloaded it and extract it (tar -xzvf [filename]). Then cd into the extracted directory and install it by typing "sudo python setup.py install". It should install; you can post here if you have any problems. As far as Windows goes, you may be on your own -- I can't imagine it's very tough, and there's probably a tutorial somewhere online.

                                                Now that the installation is out of the way, it's time to get down to business. I'll give you the basics here, and you should be able to refer to the documentation for more complicated examples. I'm going to assume you have a basic familiarity with HTML -- if you don't, you may want to search for a quick tutorial. Let's make our first example getting a list of today's MLB injuries from sbrodds.com:

                                                PHP Code:
                                                
                                                
                                                from BeautifulSoup import BeautifulSoup, SoupStrainer ## This tells python to use Beautiful Soup
                                                from mechanize import Browser   ## This tells python we want to use a browser (which is defined in mechanize)
                                                import re   ## This tells python that we will be using some regular expressions.
                                                            ## .. Regular expression allow us to search for a sequence of characters
                                                            ## .. within a larger string
                                                import time
                                                import datetime
                                                
                                                ## The first step is to create our browser..
                                                br = Browser()
                                                
                                                ## Now let's open the injuries page on sbrodds.com. This one line will open and retrieve the html.
                                                response = br.open("http://www.sbrodds.com/StoryArchivesForm.aspx?ShortNameLeague=mlb&ArticleType=injury&l=3").read()
                                                
                                                ## Now we need to tell Beautiful Soup that we would like to search through the response.
                                                ## .. This next line will tell beautiful soup to only return links to the individual injuries.
                                                ## .. We know that all the links to the injuries have "ShortNameLeague=mlb&ArticleType=injury" 
                                                ## .. in their url, so we search for these links. Each of these links has a title that describes
                                                ## .. the injury which we will use in the next line.
                                                linksToInjuries = SoupStrainer('a', href=re.compile('ShortNameLeague=mlb&ArticleType=injury'))
                                                
                                                ## This will put the title of all links in the "linksToInjuries" into an array.
                                                ## We then call Set on our array to change the array to a "set" which by definition has no duplicates.
                                                injuryTitles = set([injuryPage['title'] for injuryPage in BeautifulSoup(response, parseOnlyThese=linksToInjuries)])
                                                
                                                
                                                ## Finally let's print all the injuries out that are for today's date.
                                                today = datetime.date.today()
                                                # the function strftime() (string-format time) produces nice formatting
                                                # All codes are detailed at http://www.python.org/doc/current/lib/module-time.html
                                                date =  today.strftime("%m/%d") 
                                                
                                                ## Now let's print out the injuries that we have.
                                                for title in injuryTitles:
                                                    ## See if the date is in the title, if it is: print it.
                                                    if re.search(date, title):
                                                        print title 
                                                
                                                It might seem like a lot at first, but it's not much code. Take it slow and use Google when you don't know what a function does. Googling "python [some piece of code you don't understand]" will work magic. Ask here and I can further break down any slice of code.
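If you want to experiment with the parsing step without installing anything or hitting a live site, here is a rough offline sketch of the same idea: collect the titles of injury links, drop duplicates with a set, and keep only those mentioning a given date. It swaps in the standard library's HTMLParser in place of BeautifulSoup and Mechanize, and the class name, function name, sample HTML, player names, and hrefs are all invented for illustration.

```python
from __future__ import print_function
import datetime
try:
    from HTMLParser import HTMLParser    ## Python 2
except ImportError:
    from html.parser import HTMLParser   ## Python 3

class InjuryLinkParser(HTMLParser):
    """Collect the title of every <a> tag whose href mentions ArticleType=injury."""
    def __init__(self):
        HTMLParser.__init__(self)
        self.titles = set()   ## a set drops duplicate titles automatically

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href") or ""
        title = attributes.get("title")
        if "ArticleType=injury" in href and title:
            self.titles.add(title)

def titles_for_date(html, date_text):
    """Return the sorted injury titles that contain date_text (e.g. "04/02")."""
    parser = InjuryLinkParser()
    parser.feed(html)
    parser.close()
    return sorted(t for t in parser.titles if date_text in t)

## Made-up page: one duplicated injury link, one older injury, one non-injury link
sample_html = """
<a href="Story.aspx?ArticleType=injury" title="04/02 Player A (hamstring)">A</a>
<a href="Story.aspx?ArticleType=injury" title="04/02 Player A (hamstring)">A again</a>
<a href="Story.aspx?ArticleType=injury" title="04/01 Player B (elbow)">B</a>
<a href="Story.aspx?ArticleType=news" title="04/02 Trade rumor">C</a>
"""

print(titles_for_date(sample_html, "04/02"))

## Same idea with today's date, formatted month/day as in the post above
today_text = datetime.date.today().strftime("%m/%d")
print(titles_for_date(sample_html, today_text))
```

A plain "date_text in title" check is used here instead of re.search, since the date is a fixed string rather than a pattern.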

                                                Sorry I haven't had much time. If anyone can post an example of the kind of data they would like scraped, I will create one more example using both BeautifulSoup and Mechanize.
                                                • pats3peat
                                                  SBR MVP
                                                  • 10-23-05
                                                  • 1163

                                                  #25
                                                  Got to love research, one of the best things in sports.
                                                  • MadTiger
                                                    SBR MVP
                                                    • 04-19-09
                                                    • 2724

                                                    #26
                                                    Originally posted by Wrecktangle
                                                    ... although I suppose I could call them through Python, if it is allowed. ...
                                                    It is very much allowed and possible: http://www.omegahat.org/RSPython/

                                                    (ex-developer here. Ancient. These languages are new to me, but mixed languages have been my thing for a while.)
                                                    • sycoogtit
                                                      SBR Sharp
                                                      • 02-11-10
                                                      • 322

                                                      #27
                                                      Very nice thread ljump12. Your elegant python examples have convinced a perl programmer to spend a bit more time with python.

                                                      However, I'm conflicted. This is selfish of me, but as sports bettors we have to be selfish when it comes to this. If everyone knows about an edge, then it isn't an edge anymore. Do we really want to be giving everyone these step-by-step instructions on how to research betting trends? The information on how to program web scrapers is widely available, but putting it all down right here has made it significantly easier to learn how to apply it directly to our field.

                                                      I'm sure you thought of this before you started this thread -- I guess I'm curious what your thoughts are.
                                                      • ljump12
                                                        SBR High Roller
                                                        • 12-08-09
                                                        • 109

                                                        #28
                                                        Originally posted by sycoogtit
                                                        Very nice thread ljump12. Your elegant python examples have convinced a perl programmer to spend a bit more time with python.

                                                        However, I'm conflicted. This is selfish of me, but as sports bettors we have to be selfish when it comes to this. If everyone knows about an edge, then it isn't an edge anymore. Do we really want to be giving everyone these step-by-step instructions on how to research betting trends? The information on how to program web scrapers is widely available, but putting it all down right here has made it significantly easier to learn how to apply it directly to our field.

                                                        I'm sure you thought of this before you started this thread -- I guess I'm curious what your thoughts are.
                                                        This is a very valid concern. Here's the thing, and it's kind of selfish on my part too. I'm not, and probably won't be, a huge sports bettor. It's not that I can't be... If I put 100% effort into it, I believe I could do well, but I don't really want to. Since I'm not doing it, I figure I may as well help other people. You may feel differently about what I'm doing, and I can totally respect that. I guess the bottom line is that, even given these tools and this "tutorial" (if you could call it that), not many are going to follow through with it, so I wouldn't be too worried.

                                                        Finally, one of my biggest hopes for this thread is that it sparks discussion. Please feel free to post on anything related.
                                                        • IrishTim
                                                          SBR Wise Guy
                                                          • 07-23-09
                                                          • 983

                                                          #29
                                                          I see where both of you guys are coming from, but I tend to agree with ljump here. I don't think we're going to have 100 clowns from Players Talk see this thread and all of a sudden go from looking for the 100 unit lock of the century to setting up web scrapers, churning out dbs with 20k samples, and firing away +EV plays into soft spots in the market by Friday. My guess would be that most of the people who have the patience (and intelligence quotient) to read, understand, and apply the lessons in this thread either already know how to do this type of programming or have contacts they share with and get help from.

                                                          As long as you aren't attaching databases with +EV models to each post, I think everyone is going to be okay.
                                                          • romanowski
                                                            SBR Hustler
                                                            • 06-14-06
                                                            • 85

                                                            #30
                                                            Most people are too lazy to do any of this; I wouldn't worry about losing any edge.
                                                            • frankzig
                                                              SBR MVP
                                                              • 10-26-09
                                                              • 2263

                                                              #31
                                                              this is nice
                                                              • MonkeyF0cker
                                                                SBR Posting Legend
                                                                • 06-12-07
                                                                • 12144

                                                                #32
                                                                Relatively similar? LOL. I hope you're joking.

                                                                Some people have only box scores, some have play by play data, some have pitch by pitch data, some people have linked line history tables, some people have closing number columns, some have linked player tables with keys, some have individual tables for each game, some people perform simple system or prop queries, some perform a series of queries to populate and process a model, etc., etc., etc., etc., etc. Not to mention, there are probably more than 3 billion ways that one can go about doing the exact same thing.

                                                                AGAIN, IT ALL DEPENDS ON YOUR DATASET AND HOW YOU PLAN ON USING YOUR DATA!!!!!!!

                                                                Anyone whose profession is in data warehousing should be able to grasp this simple concept the first time they are told. However, they certainly shouldn't need to be told this in the first place.
                                                                Last edited by MonkeyF0cker; 04-16-10, 11:13 PM.
                                                                • durito
                                                                  SBR Posting Legend
                                                                  • 07-03-06
                                                                  • 13173

                                                                  #33
                                                                  Yea, just ignore the odds, that should work out perfectly.
                                                                  • Wrecktangle
                                                                    SBR MVP
                                                                    • 03-01-09
                                                                    • 1524

                                                                    #34
                                                                    I'm always struck by how hard it can be to express yourself in print, and by the fact that we all use differing terms to label the same items. I'm not a database guy, but data dictionary "thingies" are important even in my simplistic world. I would like to see us stay away from the Players Talk way of solving differences of opinion here in the Tank, however.

                                                                    I keep saying this to no observable progress: I'd like to see a group form where the interest is sharing checked-out data sets. I spend way too much time cross-checking data and way too little time on model building and analysis, especially the analysis.
                                                                    • MonkeyF0cker
                                                                      SBR Posting Legend
                                                                      • 06-12-07
                                                                      • 12144

                                                                      #35
                                                                      The reason his statement was confusing is because he wasn't using the proper terminology, Wrecktangle. If someone attacks my integrity in here, I'm certainly going to prove my point. If you don't design your data tables to coincide with your end product, you'll likely create a ton of unnecessary work for yourself and inefficiencies in the modelling phase. It can make your queries a nightmare to code and process.
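To make the table-design point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The schema, column names, and sample lines are all hypothetical and invented for illustration; they are not anyone's actual database. It shows one of the layouts mentioned earlier in the thread, a line history table linked to a games table, and a helper (`closing_line`, also a made-up name) that derives the closing number from it.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE games (
        game_id    INTEGER PRIMARY KEY,
        game_date  TEXT,
        home_team  TEXT,
        away_team  TEXT
    );
    CREATE TABLE line_history (
        game_id    INTEGER REFERENCES games(game_id),
        posted_at  TEXT,     -- ISO timestamp, so text order matches time order
        home_line  INTEGER   -- moneyline on the home team
    );
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO games VALUES (1, '2010-04-16', 'HOME', 'AWAY')")
conn.executemany(
    "INSERT INTO line_history VALUES (?, ?, ?)",
    [
        (1, "2010-04-16T09:00", -120),
        (1, "2010-04-16T15:00", -130),
        (1, "2010-04-16T18:55", -135),
    ],
)


def closing_line(conn, game_id):
    """Return the last home_line posted for a game, or None if there is none."""
    row = conn.execute(
        "SELECT home_line FROM line_history "
        "WHERE game_id = ? ORDER BY posted_at DESC LIMIT 1",
        (game_id,),
    ).fetchone()
    return row[0] if row else None


print(closing_line(conn, 1))
```

If the end product only ever needs the closing number, a single closing-line column on the games table would make every query simpler. The linked table earns its extra join only when the model actually uses line movement.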

                                                                      As far as sharing datasets, I have no interest in that. I do everything programmatically and I think I have far more reliable data than the vast majority of posters here. I really doubt I'd get any desirable reciprocation for my work. Not to mention, I'm not one to trust other people's work when it comes to these things. If someone gave me a set of data, the first thing I'd do is verify its integrity. So it would be a completely unproductive process for me.
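As a rough idea of what verifying a data set's integrity could look like, here is a minimal sketch. The row layout, field names, and the two helper functions are assumptions made up for this example, and the checks are only a starting point: duplicate game ids, impossible values, and a cross-check of final scores against a second source.

```python
def find_problems(rows):
    """Return a list of (game_id, reason) pairs for rows that look wrong."""
    problems = []
    seen = set()
    for row in rows:
        game_id = row["game_id"]
        if game_id in seen:
            problems.append((game_id, "duplicate game_id"))
        seen.add(game_id)
        if row["home_score"] < 0 or row["away_score"] < 0:
            problems.append((game_id, "negative score"))
        if row["home_team"] == row["away_team"]:
            problems.append((game_id, "team plays itself"))
    return problems


def find_mismatches(rows, other_rows):
    """Return sorted game_ids whose final score differs between two sources."""
    other = dict(
        (r["game_id"], (r["home_score"], r["away_score"])) for r in other_rows
    )
    mismatched = set(
        r["game_id"]
        for r in rows
        if r["game_id"] in other
        and other[r["game_id"]] != (r["home_score"], r["away_score"])
    )
    return sorted(mismatched)


# Hypothetical rows, invented for illustration only.
shared = [
    {"game_id": 1, "home_team": "AAA", "away_team": "BBB", "home_score": 5, "away_score": 3},
    {"game_id": 2, "home_team": "CCC", "away_team": "DDD", "home_score": -1, "away_score": 4},
    {"game_id": 2, "home_team": "CCC", "away_team": "DDD", "home_score": 2, "away_score": 4},
]
own = [
    {"game_id": 1, "home_team": "AAA", "away_team": "BBB", "home_score": 5, "away_score": 4},
    {"game_id": 3, "home_team": "EEE", "away_team": "FFF", "home_score": 1, "away_score": 0},
]

for game_id, reason in find_problems(shared):
    print("%s: %s" % (game_id, reason))
print(find_mismatches(shared, own))
```

Games that appear in only one source are skipped by the cross-check; whether a missing game counts as an error depends on how the data will be used.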