Originally posted on 04/02/2010:

I'm writing this post to serve as an intro to using computers for sports betting. Programming isn't as hard as most people think, and the basic skills can be picked up in a weekend. This is by no means an extensive resource, just a brief introduction. It is my belief that the best way to beat the books is with extensive research and backtesting. What is taught here will not give you the answers. There are no 20*play GOTY locks in this thread, only the tools that will allow you to succeed. Also note this is very much a work in progress, and I will post new sections as I write them. If you have suggestions, would like to contribute, etc., just post!

Sections:
1) Intro to programming (Taught in Python)
   a) What you need to get started
   b) Basics of programming
   c) Basics of data input & output
   d) How to scrape the internet for data
   e) How to manipulate the data for Excel
2) Intro to Excel
   a) How to load in data files
   b) What can be done in Excel
3) Intro to standard wagering ideas
   a) Arbitrage
   b) Kelly Criterion

Python is one of many programming languages, and it allows us to gather, manipulate, and apply data. I believe Python is the best language for a beginner to learn because it reads like English but is still extremely powerful.

Section A) What you need to get started..

Since you're reading this thread I'll assume you have a computer. Python is a platform-independent scripting language, which means that it *should* run the same across different operating systems [Windows, Mac, Unix etc]. For this tutorial, I'm going to assume you have Mac/Linux because that is what I'm familiar with. However, it should be pretty easy to generalize to Windows.

Downloading Python
If you're on Windows you will need to download Python and IDLE [ http://www.python.org/download/ ]
Get version 2.6.* -- don't get version 3. A lot has changed in version 3, and most old code is not supported, making it a pain in the ass. Trust me on this. Version 2.6.* is what you want.

Good news: if you're on Mac or Linux, you probably already have Python!
Open up Terminal [Mac users hit apple+space to bring up Spotlight, and type in terminal].
Type in "python -V" and press enter. It should tell you which version of Python is installed. Even if it's not version 2.6.*, it will probably still do, as long as it's > 2.3 and < 3.0
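
If you'd rather check from inside Python itself, something like this rough sketch should do it. It only uses the standard sys module, and the 2.3 / 3.0 bounds are just the ones mentioned above. I've used the print(...) form with a single argument so the snippet also runs unchanged on a newer Python.

```python
import sys

## sys.version_info is a tuple like (2, 6, 4, 'final', 0)
major = sys.version_info[0]
minor = sys.version_info[1]

## Same range as above: newer than 2.3 and older than 3.0
version_ok = (2, 3) < (major, minor) < (3, 0)

print("Python version: %d.%d" % (major, minor))
if version_ok:
    print("This version should work for the tutorial")
else:
    print("The examples in this thread were written for Python 2.6")
```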

Writing Python Programs
Python programs should be written in a text editor, in a monospaced font...
Windows Users: There's a good editor called "notepad++"; google it. Alternatively, when you download Python it will come with an editor (IDLE). You could use that...
Mac Users: I like a program called "TextMate", though you need to pay for it. There's probably a free trial somewhere.

Section B) Basics of Programming..

Learning Python:
I could type up a basic tutorial in Python, but I'd be reinventing the wheel. John wrote a great introduction to programming that you can find here: http://books.google.com/books?id=aJQ...age&q=&f=false

I'd suggest you read this through. Read at least the first 4 chapters. Spend a day and DO THE EXAMPLES. The only way to learn programming is by doing. It's really not hard stuff, it just takes some time to get the basics. Again, don't just read it or you will learn nothing. Take some time and practice, practice, practice. You can post questions or snippets of code in this thread if you're having problems. I'm sure I or someone else can find and fix your problem.

Section C) Basics of data input & output..

If you have gotten to this point, you should already know the basics of python. You should know what an "if statement" is, what a "for loop" is, and how to print "Hello World!".
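
As a quick refresher, here is a tiny sketch that uses all three. The numbers are made up purely for illustration, and the print(...) form with one argument is used so it runs on either Python 2.6 or a newer version.

```python
## A for loop walks through a list one item at a time
scores = [3, 9, 7]          ## made-up run totals, just for illustration
big_games = 0

for runs in scores:
    ## An if statement only runs its block when the test is true
    if runs > 8:
        big_games += 1

print("Hello World!")
print("Games with more than 8 runs: %d" % big_games)
```

If any of that looks unfamiliar, it's worth going back over the first few chapters before carrying on.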

In general, the tasks we are trying to do with Python will either be taking data from Excel and manipulating/running tests on it, or getting data from the internet and writing it to an Excel file for easier access. We can do both with Python! Excel can open what is known as a "CSV" (comma-separated values) file and display it in spreadsheet format, so all we have to do is have our Python program output a file that is comma separated -- and we can load it right into Excel.
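
To make "comma separated" concrete, here is a small sketch that parses two lines of CSV text with Python's built-in csv module. The lines are made up by me (they are not taken from any real file), and I'm feeding the reader a plain list of strings instead of an open file, which csv.reader accepts just as well.

```python
import csv

## Two lines of made-up comma separated text (not from a real file)
lines = ['date,teams,ou_line,runs',
         '9/21/09,"aaa at bbb, game 2",8.5,7']

## csv.reader accepts any iterable of strings, not just an open file
rows = list(csv.reader(lines, delimiter=',', quotechar='"'))

print(rows[0])
print(rows[1])
```

Two things to notice: the comma inside the quoted teams field does not split the field, which is what the quote character is for, and every value comes back as a string, so numbers need float() or int() before you can do any math on them.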

Let's start with a simple example. I have uploaded a .csv file to my website that contains MLB game information for a single day. Download and save this file into the same directory that your Python script will run from. If you open the file in Excel, you will get a better idea of what is inside it. You'll find the file here: http://atbgreen.com/mlb_ex_1.csv

[Opening and Reading a .csv File]
Code:
## Created on 4/1/10
## This example shows how to open and read a .csv file.
##
## Notes: The file mlb_ex_1.csv should be downloaded and in the same folder
## .. as this script

## Tell python to import the csv module, because we will be reading a .csv
import csv

## First we need to open the file
mlb_file = open("mlb_ex_1.csv","r") ## Open the MLB .csv file for reading

## Now we need to tell python to read it as a csv file
## We are opening the mlb_file as defined above, it is delimited by commas,
## and our quote character is a regular quote (")
mlbReader = csv.reader(mlb_file, delimiter=',', quotechar='"')

## Grab the first line, because it is the headers..
headers = mlbReader.next()

## It's now time to iterate through the file row by row...
for row in mlbReader:
    ## Let's try and only print the Over/Under Line, and the actual runs scored
    ## .. in the game. If you look at the .csv in excel you will see these are
    ## .. in the 6th & 7th columns. But since the computer starts counting at 0,
    ## .. we would say they are at positions 5 and 6
    ou_line     = float(row[5]) ## This should be a float, because it can be .5
    runs_scored = int(row[6])   ## This will be an int, because runs are integers

    print "The line was",ou_line,"and",runs_scored,"runs were scored"

## End of program
Save and run the code. I've commented it generously so you can tell exactly what's going on. It looks long, but only because I've tried to make it as clear as possible. If I wanted, I could compress the code into 3 lines -- but it's not nearly as easy to understand.

[Opening and Reading a .csv File (in 3 lines)]
Code:
import csv
mlbReader = csv.reader(open("mlb_ex_1.csv"),delimiter=',',quotechar='"')
for row in mlbReader: print "The line was",row[5],"and",row[6],"runs were scored"
## Note: this short version does not skip the header row, so it gets printed too
Let's go a step further this time, and do some calculations with our file. Let's determine whether the game went over or under.
[Opening and Reading a .csv File, and determining over or under]
Code:
## Created on 4/1/10
## This example shows how to open and read a .csv file, and perform
## .. some simple calculations
##
## Notes: The file mlb_ex_1.csv should be downloaded and in the same folder
## .. as this script

## Tell python to import the csv module, because we will be reading a .csv
import csv

total_overs  = 0   ## Initialize the total number of overs to 0
total_unders = 0   ## Initialize the total number of unders to 0

## First we need to open the file
mlb_file = open("mlb_ex_1.csv","r") ## Open the MLB .csv file for reading

## Now we need to tell python to read it as a csv file
## We are opening the mlb_file as defined above, it is delimited by commas,
## and our quote character is a regular quote (")
mlbReader = csv.reader(mlb_file, delimiter=',', quotechar='"')

## Grab the first line, because it is the headers..
headers = mlbReader.next()

## It's now time to iterate through the file row by row...
for row in mlbReader:
    ## First we need to get the OU_Line, and runs scored out of the file.
    ou_line     = float(row[5]) ## This should be a float, because it can be .5
    runs_scored = int(row[6])   ## This will be an int, because runs are integers

    ## Now let's compare the two with an if statement to see what happened:
    if ou_line > runs_scored:
        ou_result = "Under"
        total_unders += 1
    elif ou_line < runs_scored:
        ou_result = "Over"
        total_overs += 1
    else:
        ou_result = "Push"

    ## Finally let's put it all together in one print statement
    print "The line was",ou_line,"and",runs_scored,"runs were scored, so the game went",ou_result
    ## END OF FOR LOOP

## Calculate the percent of games that went over (pushes are left out), and
## .. round it to 2 decimal places. We do this after the loop, and only if at
## .. least one game was decided, so we never divide by zero.
decided_games = total_overs + total_unders
if decided_games > 0:
    over_under_percentage = round((total_overs / float(decided_games)),2)*100
else:
    over_under_percentage = 0.0

print "There were",total_overs,"Overs"
print "There were",total_unders,"Unders"
print over_under_percentage,"percent of games went Over"

## End of program
That's really all there is to reading in a file. What you do after you have read the file in is completely up to you. All the columns are accessible in the "row" list, and can be accessed by asking for a position in the list. Remember the position is always one less than the column number. For example, if you want the 7th column, you would use row[6].
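
Here's a small sketch of the counting-from-zero idea. The header names and the row are made up by me (they are not the real headers in mlb_ex_1.csv). It also shows one way to look up a position by header name with the list's index() method, so you don't have to count columns by hand.

```python
## Made-up headers and row, standing in for one line of a .csv file
headers = ["date", "time", "teams", "ou_line", "runs"]
row = ["9/21/09", "7:05", "aaa at bbb", "8.5", "7"]

third_column = row[2]                    ## 3rd column lives at position 2
ou_position = headers.index("ou_line")   ## look a position up by header name
ou_line = float(row[ou_position])        ## values are strings until converted

print(third_column)
print(ou_position)
print(ou_line)
```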

Let's move on to data output. Let's expand on our old example and say that after we work out whether the game went over or under, we want to write the result to a new file. We want our new file to have three columns: Date, Teams, OverUnder. If we look in our sheet we will see that the date and teams are in columns 1 and 3 respectively (row[0] and row[2]). We will call our new file MLB_output.csv

Code:
## Created on 4/1/10
## This example shows how to open and read a .csv file, work out the
## .. over/under result, and write the results to a new .csv file
##
## Notes: The file mlb_ex_1.csv should be downloaded and in the same folder
## .. as this script

## Tell python to import the csv module, because we will be reading a .csv
import csv

## First we need to open both files
mlb_file    = open("mlb_ex_1.csv","r")   ## Open the MLB .csv file for reading
output_file = open("MLB_output.csv","w") ## Open the output file for writing
## (On Windows, use "wb" instead of "w" here, otherwise Python 2 writes a
## .. blank line between rows)

## Now we need to tell python to read it as a csv file
## We are opening the mlb_file as defined above, it is delimited by commas,
## and our quote character is a regular quote (")
mlbReader = csv.reader(mlb_file, delimiter=',', quotechar='"')

## We'll do the same for our writer. We need to tell it where we will be writing
## .. to, and what kind of delimiters we want to use.
mlbWriter = csv.writer(output_file, delimiter=',', quotechar='"')

## Grab the first line, because it is the headers..
headers = mlbReader.next()

## It's now time to iterate through the file row by row...
for row in mlbReader:
    ## First we need to get the OU_Line, and runs scored out of the file.
    ou_line     = float(row[5]) ## This should be a float, because it can be .5
    runs_scored = int(row[6])   ## This will be an int, because runs are integers

    ## Now let's get the other information we need out (Date and Teams)
    date  = row[0]
    teams = row[2]

    ## Now let's compare the two with an if statement to see what happened:
    if ou_line > runs_scored:
        ou_result = "Under"
    elif ou_line < runs_scored:
        ou_result = "Over"
    else:
        ou_result = "Push"

    ## Instead of printing here like we did before, we want to write to the file
    mlbWriter.writerow([date,teams,ou_result])
    ## END OF FOR LOOP

output_file.close() ## Close the file after we have written everything
print "The program has written everything!"
## End of program
Try running the program. After you do, you should see that a new file has been created. This file will contain exactly what we expect:

Code:
9/21/09,atl at nyn,Under
9/21/09,bal at tor,Under
9/21/09,bos at kca,Under
9/21/09,chn at mil,Under
9/21/09,min at cha,Over
9/21/09,nya at ana,Over
9/21/09,sdn at pit,Under
9/21/09,sln at hou,Under
9/21/09,tex at oak,Under
That's really all there is to basic input and output of files!
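
If you want to convince yourself that what you wrote can be read back in, here is a rough round-trip sketch. Everything in it is my own assumption for illustration: the rows are made up, the file name ou_roundtrip_example.csv is hypothetical, and the file goes into your temp directory so it doesn't clutter your working folder. I also add a header row and pass lineterminator='\n' to the writer, neither of which the program above does, so that the same code behaves on Windows, Mac, and Linux.

```python
import csv
import os
import tempfile

## Made-up rows in the same Date, Teams, OverUnder layout as above
rows = [["9/21/09", "aaa at bbb", "Under"],
        ["9/21/09", "ccc at ddd", "Over"]]

## Write to a throwaway file in the temp directory (hypothetical file name)
path = os.path.join(tempfile.gettempdir(), "ou_roundtrip_example.csv")

out_file = open(path, "w")
writer = csv.writer(out_file, delimiter=',', quotechar='"', lineterminator='\n')
writer.writerow(["Date", "Teams", "OverUnder"])   ## header row first
for row in rows:
    writer.writerow(row)
out_file.close()

## Read it back in to confirm the file holds what we wrote
in_file = open(path, "r")
reader = csv.reader(in_file, delimiter=',', quotechar='"')
read_back = [line for line in reader]
in_file.close()
os.remove(path)   ## clean up the throwaway file

print(read_back)
```

A header row is handy once you open the file in Excel, but remember to skip it when you read the file back into Python.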

Section D) How to scrape the internet for data

To be continued......