1. #1
    ljump12
    Join Date: 12-08-09
    Posts: 108
    Betpoints: 258

    An introduction to research

    I'm writing this post to serve as an intro to using computers for sports betting. Programming isn't as hard as most people think, and the basic skills can be picked up in a weekend. This is by no means an extensive resource; it's a brief introduction. It is my belief that the best way to beat the books is with extensive research and backtesting. What is taught here will not give you the answers. There are no 20* play GOTY locks in this thread, only the tools that will allow you to succeed. Also note this is very much a work in progress, and I will post new sections as I write them. If you have suggestions, would like to contribute, etc., just post!

    Sections:
    1) Intro to programming (Taught in Python)
       a) What you need to get started
       b) Basics of programming
       c) Basics of data input & output
       d) How to scrape the internet for data
       e) How to manipulate the data for Excel
    2) Intro to Excel
       a) How to load in data files
       b) What can be done in Excel
    3) Intro to standard wagering ideas
       a) Arbitrage
       b) Kelly Criterion

    Python is one of many programming languages, and it allows us to gather, manipulate, and apply data. I believe Python is the best language for a beginner to learn because it reads like English, but is still extremely powerful.

    Section A) What you need to get started..

    Since you're reading this thread I'll assume you have a computer. Python is a platform-independent scripting language, which means that it *should* run the same across different operating systems [Windows, Mac, Unix etc]. For this tutorial, I'm going to assume you have Mac/Linux because that is what I'm familiar with. However, it should be pretty easy to generalize to Windows.

    Downloading Python
    If you're on Windows you will need to download Python and IDLE [ http://www.python.org/download/ ]
    Get version 2.6.* -- don't get version 3. A lot has changed in version 3, and most old code is not supported, making it a pain in the ass. Trust me on this. Version 2.6.* is what you want.

    Good news: if you're on Mac or Linux, you probably already have Python!
    Open up Terminal [Mac users hit apple+space to bring up Spotlight, and type in terminal].
    Type in "python -V" and press enter. It should tell you which version of Python is installed. Even if it's not version 2.6.*, it will probably still do, as long as it's > 2.3 and < 3.0
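If you'd rather check from inside Python itself, here is a minimal sketch of the same test. The helper name `version_ok` is made up for illustration, and the bounds simply mirror the "> 2.3 and < 3.0" rule of thumb above; it is not an official compatibility check.

```python
import sys

def version_ok(version_info):
    """Return True if the (major, minor) version is > 2.3 and < 3.0."""
    major, minor = version_info[0], version_info[1]
    return (2, 3) < (major, minor) < (3, 0)

if __name__ == "__main__":
    # sys.version_info is a tuple-like object, e.g. (2, 6, 5, 'final', 0)
    print("Running Python %d.%d" % (sys.version_info[0], sys.version_info[1]))
    print("Matches the advice in this thread: %s" % version_ok(sys.version_info))
```

Save it as a .py file and run it with "python yourfile.py" to see which interpreter actually picks it up.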

    Writing Python Programs
    Python programs should be written in a text editor, in a monospaced font...
    Windows Users: There's a good editor called "Notepad++"; google it. Alternatively, when you download Python it will come with an editor. You could use that...
    Mac Users: I like a program called "TextMate", though you need to pay for it. There's probably a free trial somewhere.

    Section B) Basics of Programming..

    Learning Python:
    I could type up a basic tutorial in Python, but I'd be reinventing the wheel. John wrote a great introduction to programming that you can find here: http://books.google.com/books?id=aJQ...age&q=&f=false

    I'd suggest you read this through. Read at least the first 4 chapters. Spend a day and DO THE EXAMPLES. The only way to learn programming is by doing. It's really not hard stuff; it just takes some time to get the basics. Again, don't just read it or you will learn nothing. Take some time and practice, practice, practice. You can post questions or snippets of code in this thread if you're having problems. I'm sure I or someone else can find and fix your problem.

    Section C) Basics of data input & output..

    If you have gotten to this point, you should already know the basics of python. You should know what an "if statement" is, what a "for loop" is, and how to print "Hello World!".

    In general the tasks we are trying to do with python will either be taking data from excel and manipulating/running tests on it, or getting data from the internet, and writing it to an excel file for easier access. We can do both with python! Excel takes in what is known as a "CSV" or comma separated file, and displays it in spreadsheet format, so all we have to do is have our python program output a file that is comma separated -- and we can load it right into excel.
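To make "comma separated" concrete, here is a minimal sketch that writes a couple of rows to CSV text and prints the result. It uses Python 3 syntax (the examples in this thread target 2.6), the helper name `rows_to_csv_text` is made up, and the sample rows are invented rather than taken from any real file.

```python
import csv
import io

def rows_to_csv_text(rows):
    """Write a list of rows to CSV-formatted text and return it."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up sample rows, only here to show the shape of a CSV file
sample_rows = [
    ["date", "teams", "ou_line"],
    ["9/21/09", "atl at nyn", 8.5],
]
csv_text = rows_to_csv_text(sample_rows)
print(csv_text)
```

Each row becomes one line of text and each value is separated by a comma, which is all Excel needs to lay it out as a spreadsheet.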

    Let's start with a simple example. I have uploaded a .csv file to my website; it contains MLB game information for a single day. Download and save this file into the same directory that your Python script will run from. If you open the file in Excel, you will get a better idea of what is inside it. You'll find the file here: http://atbgreen.com/mlb_ex_1.csv

    [Opening and Reading a .csv File]
    Code:
    ## Created on 4/1/10
    ## This example shows how to open and read a .csv file.
    ##
    ## Notes: The file mlb_ex_1.csv should be downloaded and in the same folder
    ## .. as this script

    ## Tell python to import the csv module, because we will be reading a .csv
    import csv

    ## First we need to open the file
    mlb_file = open("mlb_ex_1.csv", "r")  ## Open the MLB .csv file for reading

    ## Now we need to tell python to read it as a csv file
    ## We are opening the mlb_file as defined above, it is delimited by commas,
    ## and our quote character is a regular quote (")
    mlbReader = csv.reader(mlb_file, delimiter=',', quotechar='"')

    ## Grab the first line, because it is the headers..
    headers = mlbReader.next()

    ## It's now time to iterate through the file row by row...
    for row in mlbReader:
        ## Let's try and only print the Over/Under Line, and the actual runs scored
        ## .. in the game. If you look at the .csv in excel you will see these are
        ## .. in the 6th & 7th columns. But since the computer starts counting at 0,
        ## .. we would say they are at positions 5 and 6
        ou_line     = float(row[5])  ## This should be a float, because it can be .5
        runs_scored = int(row[6])    ## This will be an int, because runs are integers

        print "The line was", ou_line, "and", runs_scored, "runs were scored"

    ## End of program
    Save and run the code. I've commented it generously so you can tell exactly what's going on. It looks long, but that's only because I've tried to make it as clear as possible. If I wanted, I could compress the code into 3 lines, but it wouldn't be nearly as easy to understand.

    [Opening and Reading a .csv File (in 3 lines)]
    Code:
    import csv
    mlbReader = csv.reader(open("mlb_ex_1.csv"), delimiter=',', quotechar='"')
    for row in mlbReader: print "The line was", row[5], "and", row[6], "runs were scored"
    Let's go a step further this time, and do some calculations with our file. Let's determine whether the game went over or under.
    [Opening and Reading a .csv File, and determining over or under]
    Code:
    ## Created on 4/1/10
    ## This example shows how to open and read a .csv file, and perform
    ## .. some simple calculations
    ##
    ## Notes: The file mlb_ex_1.csv should be downloaded and in the same folder
    ## .. as this script

    ## Tell python to import the csv module, because we will be reading a .csv
    import csv

    total_overs  = 0   ## Initialize the total number of overs to 0
    total_unders = 0   ## Initialize the total number of unders to 0

    ## First we need to open the file
    mlb_file = open("mlb_ex_1.csv", "r")  ## Open the MLB .csv file for reading

    ## Now we need to tell python to read it as a csv file
    ## We are opening the mlb_file as defined above, it is delimited by commas,
    ## and our quote character is a regular quote (")
    mlbReader = csv.reader(mlb_file, delimiter=',', quotechar='"')

    ## Grab the first line, because it is the headers..
    headers = mlbReader.next()

    ## It's now time to iterate through the file row by row...
    for row in mlbReader:
        ## First we need to get the OU_Line, and runs scored out of the file.
        ou_line     = float(row[5])  ## This should be a float, because it can be .5
        runs_scored = int(row[6])    ## This will be an int, because runs are integers

        ## Now lets compare the two with an if statement to see what happened:
        if ou_line > runs_scored:
            ou_result = "Under"
            total_unders += 1
        elif ou_line < runs_scored:
            ou_result = "Over"
            total_overs += 1
        else:
            ou_result = "Push"

        ## Finally let's put it all together in one print statement
        print "The line was", ou_line, "and", runs_scored, "runs were scored, so the game went", ou_result
    ## END OF FOR LOOP

    ## Calculate the percent of games that went over (pushes don't count), with the
    ## .. fraction rounded to 2 decimal places. We do this after the loop, and only
    ## .. if at least one game was decided, so we never divide by zero.
    decided_games = total_overs + total_unders
    if decided_games > 0:
        over_under_percentage = round(total_overs / float(decided_games), 2) * 100
    else:
        over_under_percentage = 0.0

    print "There were", total_overs, "Overs"
    print "There were", total_unders, "Unders"
    print over_under_percentage, "percent of games went Over"

    ## End of program
    That's really all there is to reading in a file. What you do after you have read the file in is completely up to you. All the columns are accessible in the "row" array, and can be accessed by asking for a position out of the array. Remember the position is always one less than its column number. For example, if you want the 7th column, you would use row[6].
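If counting positions gets error-prone, one alternative is to look columns up by their header name with csv.DictReader. This is only a sketch in Python 3 syntax (the thread's own code targets 2.6). The helper `read_totals`, the header names, and the sample values are all made up for illustration, so check the real headers in mlb_ex_1.csv before relying on them.

```python
import csv
import io

def read_totals(csv_file):
    """Return (teams, ou_line, runs_scored) tuples, looked up by column name."""
    results = []
    for row in csv.DictReader(csv_file):
        results.append((row["teams"], float(row["ou_line"]), int(row["runs_scored"])))
    return results

# Hypothetical header names and made-up values; the real mlb_ex_1.csv may differ.
sample = io.StringIO(
    "date,game_id,teams,away_runs,home_runs,ou_line,runs_scored\n"
    "9/21/09,1,atl at nyn,3,1,8.5,4\n"
)
totals = read_totals(sample)
print(totals)
```

The tradeoff is that the code depends on the exact header spelling instead of the column order, which tends to survive better when someone inserts a new column into the sheet.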

    Let's move on to data output. Let's further expand on our old example, and say that after we calculate whether the game went over or under, we want to write it to a new file. We want our new file to have three columns: Date, Teams, and OverUnder. If we look in our sheet we will see that the date and teams are in columns 1 and 3 respectively. We will call our new file MLB_output.csv

    Code:
    ## Created on 4/1/10
    ## This example shows how to open and read a .csv file, perform some
    ## .. simple calculations, and write the results to a new .csv file
    ##
    ## Notes: The file mlb_ex_1.csv should be downloaded and in the same folder
    ## .. as this script

    ## Tell python to import the csv module, because we will be reading a .csv
    import csv

    ## First we need to open both files
    mlb_file    = open("mlb_ex_1.csv", "r")     ## Open the MLB .csv file for reading
    output_file = open("MLB_output.csv", "wb")  ## Open the output file for writing
                                                ## .. ("b" avoids blank lines on Windows)

    ## Now we need to tell python to read it as a csv file
    ## We are opening the mlb_file as defined above, it is delimited by commas,
    ## and our quote character is a regular quote (")
    mlbReader = csv.reader(mlb_file, delimiter=',', quotechar='"')

    ## We'll do the same for our writer. We need to tell it where we will be writing
    ## .. to, and what kind of delimiters we want to use.
    mlbWriter = csv.writer(output_file, delimiter=',', quotechar='"')

    ## Grab the first line, because it is the headers..
    headers = mlbReader.next()

    ## It's now time to iterate through the file row by row...
    for row in mlbReader:
        ## First we need to get the OU_Line, and runs scored out of the file.
        ou_line     = float(row[5])  ## This should be a float, because it can be .5
        runs_scored = int(row[6])    ## This will be an int, because runs are integers

        ## Now lets get the other information we need out (Date and Teams)
        date  = row[0]
        teams = row[2]

        ## Now lets compare the two with an if statement to see what happened:
        if ou_line > runs_scored:
            ou_result = "Under"
        elif ou_line < runs_scored:
            ou_result = "Over"
        else:
            ou_result = "Push"

        ## Instead of printing here like we did before, we want to write to the file
        mlbWriter.writerow([date, teams, ou_result])
    ## END OF FOR LOOP

    mlb_file.close()    ## Close the input file now that we are done reading
    output_file.close() ## Close the file after we have written everything
    print "The program has written everything!"
    ## End of program
    Try running the program. After you do, you should see that a new file has been created. This file will contain exactly what we expect:

    Code:
    9/21/09,atl at nyn,Under
    9/21/09,bal at tor,Under
    9/21/09,bos at kca,Under
    9/21/09,chn at mil,Under
    9/21/09,min at cha,Over
    9/21/09,nya at ana,Over
    9/21/09,sdn at pit,Under
    9/21/09,sln at hou,Under
    9/21/09,tex at oak,Under
    That's really all there is to basic input and output of files!
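Once the over/under logic shows up in more than one script, it can help to pull it into small functions. The sketch below is one possible way to do that, written in Python 3 syntax (the thread's code targets 2.6). The function names `classify` and `over_percentage` are made up here, and the result list just mirrors the 7 Unders and 2 Overs shown in the output above.

```python
def classify(ou_line, runs_scored):
    """Return "Over", "Under", or "Push" for one game."""
    if runs_scored > ou_line:
        return "Over"
    if runs_scored < ou_line:
        return "Under"
    return "Push"

def over_percentage(results):
    """Percent of decided games (pushes excluded) that went Over, or None."""
    decided = [result for result in results if result != "Push"]
    if not decided:
        return None  # nothing to divide by yet
    return 100.0 * decided.count("Over") / len(decided)

# Results matching the sample output above: 7 Unders and 2 Overs
sample_results = ["Under"] * 7 + ["Over"] * 2
print(round(over_percentage(sample_results), 1), "percent of games went Over")
```

Returning None when no games are decided is a design choice for this sketch; it keeps the "no data" case separate from a real 0 percent.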

    Section D) How to scrape the internet for data

    To be continued......

    Last edited by ljump12; 04-02-10 at 11:56 AM.

  2. #2
    MrX
    Join Date: 01-10-06
    Posts: 1,540

    Potentially one of the best threads on here. Good job.

    I would suggest throwing some mysql into the mix.

  3. #3
    IrishTim
    Join Date: 07-23-09
    Posts: 983
    Betpoints: 127

    Quote Originally Posted by MrX View Post
    Potentially one of the best threads on here. Good job.

    I would suggest throwing some mysql into the mix.
    No doubt. Looking forward to see this one progress.

  4. #4
    Jule
    Join Date: 04-02-10
    Posts: 404

    Great info. :-)

  5. #5
    jessetopolski
    912
    Join Date: 12-20-09
    Posts: 162

    Interesting

  6. #6
    uva3021
    Join Date: 03-01-07
    Posts: 537
    Betpoints: 381

    great thread, looking forward to continue reading

  7. #7
    statnerds
    Put me in coach
    Join Date: 09-23-09
    Posts: 4,047
    Betpoints: 103

    excellent.

    thanks for the huge contribution. i will try my hand at this

  8. #8
    Wrecktangle
    Join Date: 03-01-09
    Posts: 1,524
    Betpoints: 3209

    Nicely written article. Are there any subroutine libraries we can obtain so we don't have to write all our typically used routines from scratch?

  9. #9
    ljump12
    Join Date: 12-08-09
    Posts: 108
    Betpoints: 258

    As far as Python and web scraping go, you're going to want to obtain BeautifulSoup and something called Mechanize. I intend to write about these in the next section. I may try to write up a commonly used betting function library; I have a bunch of functions lying around that I could probably scrape together.

  10. #10
    benjaminj78
    Join Date: 03-13-10
    Posts: 15
    Betpoints: 60

    Python is a lifesaver when learning to code. This is the best reference anywhere in regards to building the ultimate spreadsheet set-up. Thanks!

  11. #11
    Wrecktangle
    Join Date: 03-01-09
    Posts: 1,524
    Betpoints: 3209

    OK, is there something about Python that makes it easier or more forgiving to program in than other languages?

    Why not use some variant of C? or VBA? or Java?

  12. #12
    ljump12
    Join Date: 12-08-09
    Posts: 108
    Betpoints: 258

    Quote Originally Posted by Wrecktangle View Post
    OK, is there something about Python that makes it easier or more forgiving to program in than other languages?

    Why not use some variant of C? or VBA? or Java?
    Python is infinitely easier than C, Visual Basic and Java. It will become more apparent the more you do with it, but take a look at the code to open and parse a CSV file: 3 readable lines. Show me a short elegant solution like that in C or Java. Don't get me wrong, compiled languages like C and Java have their place -- and are much, much more efficient. But for this type of project, you will want to use a scripting language (Python, Ruby, Perl etc...). I've just chosen Python because it's what I'm most familiar with.

    Bottom line: I can write something in Python in 1/3 the time it takes me to write something in C or Java, with increased readability/maintainability, and without losing any functionality.

  13. #13
    MonkeyF0cker
    Join Date: 06-12-07
    Posts: 12,144
    Betpoints: 1127

    It may be slightly faster to code parsing a CSV, but why would you be working much with CSV's anyway? Personally, I hate them. Generally, you'd be scraping directly into a DB of some sort anyway. At least, you probably should.
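For anyone wondering what "scraping directly into a DB" might look like from Python, here is a minimal sketch using the built-in sqlite3 module. It is an illustration only, in Python 3 syntax: the table layout, the helper names `save_games` and `count_overs`, and the sample numbers are all assumptions, and a MySQL setup like the one suggested earlier in the thread would need its own driver.

```python
import sqlite3

def save_games(conn, games):
    """Create a simple games table if needed and insert the given rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS games ("
        "game_date TEXT, teams TEXT, ou_line REAL, runs_scored INTEGER)"
    )
    conn.executemany("INSERT INTO games VALUES (?, ?, ?, ?)", games)
    conn.commit()

def count_overs(conn):
    """Count games where the runs scored beat the over/under line."""
    cursor = conn.execute("SELECT COUNT(*) FROM games WHERE runs_scored > ou_line")
    return cursor.fetchone()[0]

# An in-memory database with made-up rows, so nothing is written to disk
conn = sqlite3.connect(":memory:")
save_games(conn, [
    ("9/21/09", "atl at nyn", 8.5, 4),
    ("9/21/09", "min at cha", 9.0, 12),
])
print(count_overs(conn), "game(s) went Over")
```

Swapping ":memory:" for a filename would keep the data between runs, and the query then replaces the if/elif counting done by hand in the CSV examples.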

  14. #14
    MonkeyF0cker
    Join Date: 06-12-07
    Posts: 12,144
    Betpoints: 1127

    Also, I prefer my efficiency to be on the executable side and not the coding side of my projects. However, for the purposes of this thread, I would agree that Python is far better as a tutorial on a more efficient language would be much more involved.

  15. #15
    Daniel
    Join Date: 03-30-10
    Posts: 22

    Not that it really matters, but parsing a CSV file with C# is just a few lines of code too.

    Code:
    TextReader tr = new StreamReader(fileName);
    string line;
    while ((line = tr.ReadLine()) != null)
    {
       string[] brokenDown = line.Split(',');
       // Do what you want with the array of strings split on the commas
    }
    What it boils down to is obviously language preference.. everyone has their own language that they feel the most comfortable with. The end result is still based more on the logic than which language you used.
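One caveat with splitting on commas, in any language, is a quoted field that itself contains a comma. The sketch below (Python 3 syntax, with a made-up line of data) compares a plain split against the csv module on the same line.

```python
import csv
import io

# A made-up row where the second field contains a comma inside quotes
line = '9/21/09,"Smith, John",8.5'

naive_fields = line.split(",")                        # splits inside the quotes too
parsed_fields = next(csv.reader(io.StringIO(line)))   # respects the quotes

print(len(naive_fields), naive_fields)
print(len(parsed_fields), parsed_fields)
```

If none of your fields ever contain commas the two approaches agree, so this only matters for messier data such as player names or free-text notes.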

  16. #16
    Wrecktangle
    Join Date: 03-01-09
    Posts: 1,524
    Betpoints: 3209

    I guess this sort of goes to my point. Granted Python is good for scraping, what does it have for me as far as modeling sports? If I'm a newbie, not wanting to learn two languages to do my programming (one to scrape, one to organize my modeling efforts, perhaps R), which one should I use? In my world, checked out scientific routines are important. Reinventing statistical wheels might be fun for some, but I'd rather directly model sports techniques.

  17. #17
    Daniel
    Join Date: 03-30-10
    Posts: 22

    Like I said, language is secondary. Choose the one you're comfortable using, be it Python or .net or whatever. For this project, it's still going to be the output that matters. As far as I know, no language is more suited to statistical analysis than another. At least not for a project such as this.

    I'm a .net developer, I would choose C# over Python every day of the week and twice on Sundays.. but for someone else the preference is probably completely opposite. That doesn't mean that I can create something that more accurately analyzes data than the other guy.

  18. #18
    MonkeyF0cker
    Join Date: 06-12-07
    Posts: 12,144
    Betpoints: 1127

    It also depends on your environment. .NET won't do you much good in Linux. That said, I use .NET to code the majority of my handicapping apps.

  19. #19
    MonkeyF0cker
    Join Date: 06-12-07
    Posts: 12,144
    Betpoints: 1127

    If you'd like to concentrate on only one language, I'd honestly go with C#. There are a lot of things that I do with arrays and structs in my models that would be extremely cumbersome and inefficient in Python.

  20. #20
    ljump12
    Join Date: 12-08-09
    Posts: 108
    Betpoints: 258

    Quote Originally Posted by Wrecktangle View Post
    I guess this sort of goes to my point. Granted Python is good for scraping, what does it have for me as far as modeling sports? If I'm a newbie, not wanting to learn two languages to do my programming (one to scrape, one to organize my modeling efforts, perhaps R), which one should I use? In my world, checked out scientific routines are important. Reinventing statistical wheels might be fun for some, but I'd rather directly model sports techniques.
    R is completely different from Python; they do very different things, and both are very valuable to know. Python is more versatile than R in my opinion, and if you were only going to learn one language, Python would be it imo. However, I don't want to waste more time arguing for Python. It's mostly a personal choice -- you can use any language you'd like, and the concepts taught here should still generally apply.

  21. #21
    sharpcat
    Join Date: 12-19-09
    Posts: 4,516

    ljump12, I think many are interested to see your write-up. Please continue.

    As far as this arguing about what language everybody prefers to code in, please start your own threads and let the man continue with his thread. Let's allow this to be the educational thread it was meant to be. I am sure everybody is capable of starting their own thread if they feel the need to prove that they are more intelligent and that their technique is better.


  22. #22
    ljump12
    Join Date: 12-08-09
    Posts: 108
    Betpoints: 258

    Quote Originally Posted by MonkeyF0cker View Post
    If you'd like to concentrate on only one language, I'd honestly go with C#. There are a lot of things that I do with arrays and structs in my models that would be extremely cumbersome and inefficient in Python.
    I think efficiency is kind of a moot point at this stage. We're not dealing with anything in handicapping where efficiency would matter. I've written a Python baseball simulator that processes millions of rows of play-by-play data, and it has no trouble.

  23. #23
    OMGRandyJackson
    Join Date: 02-07-10
    Posts: 1,680
    Betpoints: 4054

    Once I get some free time, I'm going to check out the Python tutorial and get started on this. Cannot wait for you to continue with the topics!

    Thanks so much man!

  24. #24
    ljump12
    Join Date: 12-08-09
    Posts: 108
    Betpoints: 258

    Section D) How to scrape the internet for data

    One of the most important aspects of research is the data that you have. Without data, there can't be any model. Fortunately, most data is free -- Unfortunately, most data isn't immediately in the best computer parsable formats [like .csv, or .xml]. To get the data into formats we can use we will need to "scrape" websites for it.

    A couple of "packages" have been created that will greatly improve our ability to scrape webpages. It can certainly be done in Python without them, but they will make your life a whole lot easier:

    Mechanize - This will allow us to open webpages easily (http://wwwsearch.sourceforge.net/mechanize/)
    Beautiful Soup - This will allow us to parse apart the webpages (http://www.crummy.com/software/BeautifulSoup/)

    Installing Beautiful Soup is pretty easy: you can just put the Beautiful Soup python file ( http://www.crummy.com/software/Beaut...lSoup-3.0.0.py ) in the same directory you are running your code from.

    Installing Mechanize is a little tougher, on a *nix machine, cd to the directory of where you downloaded it and extract it (tar -xzvf [filename]). Then cd into the extracted directory and install it by typing "sudo python setup.py install" It should install, you can post here if you have any problems. As far as windows goes, you may be on your own -- I can't imagine it's very tough, and there's probably a tutorial somewhere online.

    Now that the installation is out of the way, it's time to get down to business. I'll give you the basics here, and you should be able to refer to the documentation for more complicated examples. I'm going to assume you have a basic familiarity with HTML -- if you don't, you may want to search for a quick tutorial. Let's make our first example getting a list of today's injuries from statfox for MLB baseball:

    Code:
    from BeautifulSoup import BeautifulSoup, SoupStrainer  ## This tells python to use Beautiful Soup
    from mechanize import Browser   ## This tells python we want to use a browser (which is defined in mechanize)
    import re   ## This tells python that we will be using some regular expressions.
                ## .. Regular expressions allow us to search for a sequence of characters
                ## .. within a larger string
    import time
    import datetime

    ## The first step is to create our browser..
    br = Browser()

    ## Now let's open the injuries page on statfox. This one line will open and retrieve the html.
    response = br.open("http://www.sbrodds.com/StoryArchivesForm.aspx?ShortNameLeague=mlb&ArticleType=injury&l=3").read()

    ## Now we need to tell Beautiful Soup that we would like to search through the response.
    ## .. This next line will tell beautiful soup to only return links to the individual injuries.
    ## .. We know that all the links to the injuries have "ShortNameLeague=mlb&ArticleType=injury"
    ## .. in their url, so we search for these links. Each of these links has a title that describes
    ## .. the injury which we will use in the next line.
    linksToInjuries = SoupStrainer('a', href=re.compile('ShortNameLeague=mlb&ArticleType=injury'))

    ## This will put the title of all links in the "linksToInjuries" into an array.
    ## We then call set on our array to change the array to a "set" which by definition has no duplicates.
    injuryTitles = set([injuryPage['title'] for injuryPage in BeautifulSoup(response, parseOnlyThese=linksToInjuries)])

    ## Finally let's print all the injuries out that are for today's date.
    today = datetime.date.today()
    # the function strftime() (string-format time) produces nice formatting
    # All codes are detailed at http://www.python.org/doc/current/lib/module-time.html
    date = today.strftime("%m/%d")

    ## Now let's print out the injuries that we have.
    for title in injuryTitles:
        ## See if the date is in the title, if it is: print it.
        if re.search(date, title):
            print title
    It might seem like a lot at first, but it's not much code. Take it slow and use Google when you don't know what a function does. Googling "python [some piece of code you don't understand]" will work magic. Ask here and I can further break down any slice of code.
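If you want to experiment with the link-filtering idea before installing anything, here is a rough sketch that uses only the standard library. Note the swap: it relies on html.parser instead of BeautifulSoup and mechanize, it is written in Python 3 syntax, and it parses a made-up HTML snippet rather than fetching a live page. The class `LinkTitleCollector`, the function `titles_for_date`, and every link and title in `sample_html` are invented for illustration.

```python
import re
from html.parser import HTMLParser

class LinkTitleCollector(HTMLParser):
    """Collect the title of every <a> tag whose href matches a pattern."""

    def __init__(self, href_pattern):
        super().__init__()
        self.href_pattern = re.compile(href_pattern)
        self.titles = set()  # a set drops duplicate titles automatically

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href") or ""
        title = attributes.get("title")
        if title and self.href_pattern.search(href):
            self.titles.add(title)

def titles_for_date(html, href_pattern, date_text):
    """Return the sorted matching link titles that mention date_text."""
    collector = LinkTitleCollector(href_pattern)
    collector.feed(html)
    return sorted(title for title in collector.titles if date_text in title)

# Made-up HTML standing in for a downloaded page
sample_html = """
<a href="/story/1?ShortNameLeague=mlb&ArticleType=injury" title="04/16 Player A (hamstring)">A</a>
<a href="/story/1?ShortNameLeague=mlb&ArticleType=injury" title="04/16 Player A (hamstring)">A again</a>
<a href="/story/2?ShortNameLeague=mlb&ArticleType=injury" title="04/15 Player B (elbow)">B</a>
<a href="/odds?ShortNameLeague=nba" title="04/16 Not an injury link">C</a>
"""

matches = titles_for_date(sample_html, "ShortNameLeague=mlb&ArticleType=injury", "04/16")
for title in matches:
    print(title)
```

The structure follows the same three steps as the example above: keep only the links whose href matches, de-duplicate their titles, then filter by date.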

    Sorry I haven't had much time -- If anyone can post an example of what kind of data they would like to be scraped, I will create one more example using both BeautifulSoup and Mechanize.

  25. #25
    pats3peat
    LETS GO PATS
    Join Date: 10-23-05
    Posts: 1,163

    Got to love research, one of the best things in sports

  26. #26
    MadTiger
    Wait 'til next year!
    Join Date: 04-19-09
    Posts: 2,724
    Betpoints: 47

    Quote Originally Posted by Wrecktangle View Post
    ... although I suppose I could call them through Python, if it is allowed. ...
    It is very much allowed, and possible: http://www.omegahat.org/RSPython/

    (ex-developer here. Ancient. These languages are new to me, but mixed languages have been my thing for a while.)

  27. #27
    sycoogtit
    play matchbook
    Join Date: 02-11-10
    Posts: 322

    Very nice thread ljump12. Your elegant python examples have convinced a perl programmer to spend a bit more time with python.

    However, I'm conflicted. This is selfish of me, but as sports bettors we have to be selfish when it comes to this. If everyone knows about an edge, then it isn't an edge anymore. Do we really want to be giving everyone these step-by-step instructions on how to research betting trends? The information on how to program web scrapers is widely available, but putting it all down right here has made it significantly easier to learn how to apply it directly to our field.

    I'm sure you thought of this before you started this thread -- I guess I'm curious what your thoughts are.

  28. #28
    ljump12
    Join Date: 12-08-09
    Posts: 108
    Betpoints: 258

    Quote Originally Posted by sycoogtit View Post
    Very nice thread ljump12. Your elegant python examples have convinced a perl programmer to spend a bit more time with python.

    However, I'm conflicted. This is selfish of me, but as sports bettors we have to be selfish when it comes to this. If everyone knows about an edge, then it isn't an edge anymore. Do we really want to be giving everyone these step-by-step instructions on how to research betting trends? The information on how to program web scrapers is widely available, but putting it all down right here has made it significantly easier to learn how to apply it directly to our field.

    I'm sure you thought of this before you started this thread -- I guess I'm curious what your thoughts are.
    This is a very valid concern. Here's the thing, and it's kind of selfish on my part too. I'm not, and probably won't be, a huge sports bettor. It's not that I can't be. It's something that, if I put 100% effort into it, I believe I could do well at, but I don't really want to. Since I'm not doing it, I figure I may as well help other people. You may feel differently about what I'm doing, and I can totally respect that. I guess the bottom line is that, even given these tools and this "tutorial" (if you could call it that), not many are going to follow through with it, so I wouldn't be too worried.

    Finally, one of my biggest hopes for this thread is that it sparks discussion. Please feel free to post on anything related..

  29. #29
    IrishTim
    Join Date: 07-23-09
    Posts: 983
    Betpoints: 127

    I see where both of you guys are coming from, but I tend to agree with ljump here. I don't think we're going to have 100 clowns from Players Talk see this thread and all of a sudden go from looking for the 100 unit lock of the century to setting up web scrapers, churning out dbs with 20k samples, and firing away +EV plays into soft spots in the market by Friday. My guess would be that most of the people who have the patience (and intelligence quotient) to read, understand, and apply the lessons in this thread already know how to do this type of programming, or have contacts they share with and get help from.

    As long as you aren't attaching databases with +EV models to each post, I think everyone is going to be okay.

  30. #30
    romanowski
    Join Date: 06-14-06
    Posts: 85
    Betpoints: 84

    most people are too lazy to do any of this, I wouldn't worry about losing any edge

  31. #31
    frankzig
    Join Date: 10-26-09
    Posts: 2,250
    Betpoints: 2242

    this is nice

  32. #32
    MonkeyF0cker
    Join Date: 06-12-07
    Posts: 12,144
    Betpoints: 1127

    Relatively similar? LOL. I hope you're joking.

    Some people have only box scores, some have play by play data, some have pitch by pitch data, some people have linked line history tables, some people have closing number columns, some have linked player tables with keys, some have individual tables for each game, some people perform simple system or prop queries, some perform a series of queries to populate and process a model, etc., etc., etc., etc., etc. Not to mention, there are probably more than 3 billion ways that one can go about doing the exact same thing.

    AGAIN, IT ALL DEPENDS ON YOUR DATASET AND HOW YOU PLAN ON USING YOUR DATA!!!!!!!

    Anyone whose profession is in data warehousing should be able to grasp this simple concept the first time they are told. However, they certainly shouldn't need to be told this in the first place.
    Last edited by MonkeyF0cker; 04-16-10 at 11:13 PM.


  33. #33
    durito
    escarabajo negro
    Join Date: 07-03-06
    Posts: 13,173
    Betpoints: 438

    Yea, just ignore the odds, that should work out perfectly.

  34. #34
    Wrecktangle
    Join Date: 03-01-09
    Posts: 1,524
    Betpoints: 3209

    I'm always struck by how hard it can be to express yourself in print, and the fact that we all use differing terms to label the same items. I'm not a data base guy but data dictionary "thingies" are important even in my simplistic world. I would like to see us stay away from the Players Talk way of solving differences of opinion here in the Tank, however.

    I keep saying this to no observable progress: I'd like to see a group form where the interest is sharing checked out data sets. I spend way too much time cross checking data and way too little time on model building and analysis; especially the analysis.

  35. #35
    MonkeyF0cker
    Join Date: 06-12-07
    Posts: 12,144
    Betpoints: 1127

    The reason his statement was confusing is because he wasn't using the proper terminology, Wrecktangle. If someone attacks my integrity in here, I'm certainly going to prove my point. If you don't design your data tables to coincide with your end product, you'll likely create a ton of unnecessary work for yourself and inefficiencies in the modelling phase. It can make your queries a nightmare to code and process.

    As far as sharing datasets, I have no interest in that. I do everything programmatically and I think I have far more reliable data than the vast majority of posters here. I really doubt I'd get any desirable reciprocation for my work. Not to mention, I'm not one to trust other people's work when it comes to these things. If someone gave me a set of data, the first thing I'd do is verify its integrity. So it would be a completely unproductive process for me.
