1. #1
    Romanov
    Diapason Knells
    Join Date: 10-08-10
    Posts: 4,137
    Betpoints: 13

    Help Creating Line Scraper in Python

    Hi,

    I've recently learned a little about Python. Basic BASIC stuff. I would like to learn how to scrape sbrodds' or Pinnacle's odds (and convert them from decimal form) and then put that data into an Excel spreadsheet. Any guidance you could lend would be great. Thanks.
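
    A minimal Python sketch of the conversion and export steps asked about here, assuming "convert from decimal form" means turning decimal odds into American odds. The team names and prices are made-up sample data, and a CSV file (which Excel opens directly) stands in for a real spreadsheet:

```python
import csv
import io


def decimal_to_american(decimal_odds):
    """Convert decimal odds (e.g. 1.91) to American odds (e.g. -110)."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    if decimal_odds >= 2.0:
        # Underdog or even money: profit per 100 staked.
        return round((decimal_odds - 1.0) * 100)
    # Favorite: stake needed to win 100, shown as a negative number.
    return round(-100 / (decimal_odds - 1.0))


def write_odds_csv(rows, out):
    """Write (team, decimal_odds) pairs as CSV rows that Excel can open."""
    writer = csv.writer(out)
    writer.writerow(["team", "decimal", "american"])
    for team, dec in rows:
        writer.writerow([team, dec, decimal_to_american(dec)])


# Made-up sample data standing in for scraped lines.
sample = [("Team A", 1.91), ("Team B", 2.50)]
buffer = io.StringIO()
write_odds_csv(sample, buffer)
print(buffer.getvalue())
```

    To write a real file instead of an in-memory buffer, pass `open("odds.csv", "w", newline="")` as `out`. The scraping step itself is left out because it depends on the site.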

  2. #2
    WendysRox
    Join Date: 07-22-10
    Posts: 184
    Betpoints: 37

    I use Excel 2003 to scrape data from a few sites. I don't know how to do it in newer versions (because I like 2003 the best), but here's how I do it: go to the "Data" menu, click "Import External Data", then choose "New Web Query". It will open a browser window with little yellow boxes next to every table. Just click the box next to the table containing the data you want and click "OK" or "Import". If you then save the spreadsheet, the next time you open it all you have to do is go to "Data" and click "Refresh Data". I saved mine as "NBA Template" and "NCAAB Template". Every day, I open these templates, get the data, then save the updated sheet as "NBA 02-09-2011" or "NCAAB 02-09-2011" or whatever. Works great for me. Good luck!
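
    For anyone who wants the same "pick a table off the page" idea in Python, here is a rough sketch using only the standard library. The HTML below is a made-up stand-in for a downloaded page, and `TableExtractor` is an illustrative name, not part of any site's tooling:

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up HTML standing in for a downloaded odds page.
SAMPLE_HTML = """
<table>
  <tr><th>Team</th><th>Spread</th><th>Price</th></tr>
  <tr><td>Team A</td><td>-1</td><td>-110</td></tr>
  <tr><td>Team B</td><td>+1</td><td>-110</td></tr>
</table>
"""

parser = TableExtractor()
parser.feed(SAMPLE_HTML)
for row in parser.rows:
    print(row)
```

    This only sees tables that are present in the HTML as downloaded; anything a page fills in later with scripts would not show up.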

  3. #3
    mark49
    Join Date: 03-03-08
    Posts: 42
    Betpoints: 1652

    Like WendysRox says, you may not need to do this through Python.

    It all depends on what exactly you want to do.
    Do you want odds at a specific time: opening, closing, or every point in between?
    What do you need your spreadsheet to record?
    How accurately do you want the odds recorded, and converted to which format?

    Personally, I would just paste Pinnacle's lines into a worksheet and then use a macro to store the information in whatever format I wanted. Very simple to do.

  4. #4
    Romanov
    Diapason Knells
    Join Date: 10-08-10
    Posts: 4,137
    Betpoints: 13

    Okay, I'm using Excel 2008 for Mac, so it's a little different. I can get Excel to TRY to download from the web, but I don't know the address/parameters for sbrodds needed to pull that info.

  5. #5
    mike1234
    ??
    Join Date: 09-06-07
    Posts: 443
    Betpoints: 17377

    I think sbrodds uses AJAX. It might be more difficult to scrape - but really not sure.

  6. #6
    byronbb
    Join Date: 11-13-08
    Posts: 3,067
    Betpoints: 2284

    Yeah, getting Pinnacle's XML feed is going to be way less difficult than scraping sbrodds.
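
    To show why an XML feed is easier to work with than scraped HTML, here is a small sketch using Python's standard `xml.etree.ElementTree`. The feed text and every element name in it (`event`, `home`, `away`, `home_price`, `away_price`) are made up for illustration; the real feed's layout is not described in this thread and will differ:

```python
import xml.etree.ElementTree as ET

# Made-up XML; the real feed's element names may differ.
SAMPLE_FEED = """
<feed>
  <event>
    <home>Team A</home>
    <away>Team B</away>
    <home_price>1.91</home_price>
    <away_price>2.00</away_price>
  </event>
</feed>
"""


def parse_events(xml_text):
    """Return one dict per <event> element in the feed text."""
    root = ET.fromstring(xml_text)
    events = []
    for event in root.iter("event"):
        events.append({
            "home": event.findtext("home"),
            "away": event.findtext("away"),
            "home_price": float(event.findtext("home_price")),
            "away_price": float(event.findtext("away_price")),
        })
    return events


for game in parse_events(SAMPLE_FEED):
    print(game)
```

    Each value comes back under a named field, so there is no need to guess which table cell holds what.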

  7. #7
    Romanov
    Diapason Knells
    Join Date: 10-08-10
    Posts: 4,137
    Betpoints: 13

    I got Pinnacle's lines into Excel using a web query. Anybody know how to get KenPom's predicted win % under FanMatch into Excel? I've been trying, but Excel 2008 is a piece of chit. It is awful.

  8. #8
    WendysRox
    Join Date: 07-22-10
    Posts: 184
    Betpoints: 37

    I just did it at donbest and sbrsportsbook. The download from SBR wasn't nearly as pretty, and Excel had trouble separating the lines and odds (e.g. "-1 -110" showed up in the same cell). But it does work. Keep trying, bud. We're here if you need more help.
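
    If a combined cell like "-1 -110" ends up in Python rather than Excel, splitting it is short. A sketch, assuming the cell always holds exactly two whitespace-separated numbers (values such as "PK" or fractions written with special characters would need extra handling):

```python
def split_line_and_price(cell):
    """Split a cell like "-1 -110" into (spread, price)."""
    spread_text, price_text = cell.split()
    return float(spread_text), int(price_text)


print(split_line_and_price("-1 -110"))
print(split_line_and_price("+3.5 -105"))
```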

  9. #9
    WendysRox
    Join Date: 07-22-10
    Posts: 184
    Betpoints: 37

    Not sure if this will paste properly, but I'll try. Also, KenPom's fanmatch page imported fine. But, again, the winning team, predicted score and win probability all showed up in the same cell. I assume it's because he put it in the same cell on the table. You *could* write a macro that would go into that cell and separate the data, but that would take a little longer to explain than a forum post from me.

    Rank Game Prediction Time (ET) Location Venue Thrill Score
    23 241 Lamar at 266 Northwestern St. Northwestern St. 89-86 (59%) 4:00 PM Natchitoches, LA Prather Coliseum 32.9
    4 43 Florida St. at 81 Georgia Tech Florida St. 66-65 (50%) 7:00 PM Atlanta, GA Alexander Memorial Coliseum 63.7
    5 57 Penn St. at 51 Michigan St. Michigan St. 67-63 (67%) 7:00 PM East Lansing, MI Jack Breslin Student Events Ce 60

  10. #10
    WendysRox
    Join Date: 07-22-10
    Posts: 184
    Betpoints: 37

    Yeah, it didn't post right, as I suspected. But you get the idea. In that first game, the cell containing the win prob looked like this: [Northwestern 89-86 (59%)]. Again, you could write a macro that would take out everything before the open parenthesis. But I'd have to spend some time on Google to explain it thoroughly. I seem to forget VB as soon as I get the task at hand finished.
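
    The same separation can be sketched in Python with a regular expression instead of a VB macro. This assumes every prediction cell follows the "Team score-score (NN%)" shape seen in the pasted rows; `parse_prediction` is just an illustrative name:

```python
import re

# Team name, two scores joined by a hyphen, then a percentage in parentheses.
PREDICTION = re.compile(
    r"^(?P<team>.+?)\s+(?P<win>\d+)-(?P<lose>\d+)\s+\((?P<pct>\d+)%\)$"
)


def parse_prediction(cell):
    """Split a prediction cell into team, scores and win probability."""
    match = PREDICTION.match(cell.strip())
    if match is None:
        raise ValueError(f"unrecognized prediction cell: {cell!r}")
    return {
        "team": match["team"],
        "winner_score": int(match["win"]),
        "loser_score": int(match["lose"]),
        "win_prob": int(match["pct"]) / 100,
    }


print(parse_prediction("Northwestern St. 89-86 (59%)"))
```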

  11. #11
    Romanov
    Diapason Knells
    Join Date: 10-08-10
    Posts: 4,137
    Betpoints: 13

    Thanks for all the help wendy.

    So what are you doing exactly to get KenPom into excel?

    On Excel 2008 I have to go into textwrangler and write an iqy a la

    Web
    1
    url

    This works for a couple of the sites that I have tried, including Pinnacle's odds and it imports those perfectly, but when I try KenPom's I get an error message (a vague one at that). Should I be trying to direct excel to the table? How would I do that? I know how to get the table name but I don't know how the iqy is wrong.
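
    For reference, the three-line layout quoted above can also be written out from Python rather than typed by hand. A small sketch: the URL and file name are placeholders, and the first line is written as "WEB" on the assumption that this is the casing Excel expects. It does not address the KenPom error, whose cause the post doesn't reveal:

```python
from pathlib import Path
import tempfile


def write_iqy(path, url):
    """Write the three lines quoted in the post: WEB, 1, then the URL."""
    Path(path).write_text(f"WEB\n1\n{url}\n", encoding="utf-8")


# Hypothetical URL; substitute the page that holds the table you want.
target = Path(tempfile.gettempdir()) / "example_query.iqy"
write_iqy(target, "http://www.example.com/odds")
print(target.read_text(encoding="utf-8"))
```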

  12. #12
    DirkDiggs
    Clippers ML!
    Join Date: 12-07-10
    Posts: 484
    Betpoints: 12

    Would anyone be kind enough to explain how to write a query in Excel to grab data from a webpage? I have Excel 2004 and 2011, and I can't seem to figure it out.

    If I go to Data>Get External Data>Run Query
    Excel says that I don't have an ODBC driver installed.

    Thanks in advance.

  13. #13
    Romanov
    Diapason Knells
    Join Date: 10-08-10
    Posts: 4,137
    Betpoints: 13

    DirkDiggs, try running a saved query, and create an iqy file in a text editor. If you search Google, Microsoft has some pages that explain it in perfect detail.

  14. #14
    arwar
    Join Date: 07-09-09
    Posts: 208
    Betpoints: 1544

    Quote Originally Posted by WendysRox View Post
    Not sure if this will paste properly, but I'll try. Also, KenPom's fanmatch page imported fine. But, again, the winning team, predicted score and win probability all showed up in the same cell. I assume it's because he put it in the same cell on the table. You *could* write a macro that would go into that cell and separate the data, but that would take a little longer to explain than a forum post from me.

    Rank Game Prediction Time (ET) Location Venue Thrill
    Check out the KenPom thread - I posted a link to a beta scraper.

  15. #15
    Maverick22
    Join Date: 04-10-10
    Posts: 807
    Betpoints: 58

    Here is a tip: don't use Python.

    Use a language that is more "friendly" for new programmers, which it sounds like you are. (No insult intended.)

    It is a very easy language to learn and pick up. But if you are not aware of what is going on and don't pay attention to your variables, you can EASILY jack something up. Python can be very dangerous if you aren't paying attention and don't understand what Python will do without you knowing it.

    I always recommend that new programmers shy away from Python. Don't let the "this seems easy enough" fool you.
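
    The post doesn't say which behavior it has in mind. One guess at the kind of surprise being described: Python creates a variable on first assignment, so a misspelled name is silently accepted instead of raising an error. A small made-up illustration:

```python
def total_stake_buggy(stakes):
    total = 0
    for stake in stakes:
        # Typo: "totl" is a brand-new variable, so "total" never changes.
        totl = total + stake
    return total


def total_stake_fixed(stakes):
    total = 0
    for stake in stakes:
        total = total + stake
    return total


print(total_stake_buggy([10, 20, 30]))
print(total_stake_fixed([10, 20, 30]))
```

    The buggy version runs without complaint and returns the wrong answer, which is the sort of thing that is easy to miss when starting out. Linters generally flag the unused `totl`.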

  16. #16
    subs
    Join Date: 04-30-10
    Posts: 1,412
    Betpoints: 969

    I have a sheet that separates the teams and scores just using LEFT, RIGHT, MID, etc., with some help from the forum.

    ganchrow even added something that I tried and tried to get right but sadly failed... May be useful to someone as a template?

    teams
    =MID(B1,SEARCH(" ",B1,1)+1,SEARCH(" at ",B1,1))
    =LEFT(C1,SEARCH(" at",C1,1))
    =RIGHT(B1,LEN(B1)-SEARCH(IF(ISERR(FIND(" at ", B1, 1)), " vs ", " at "),B1,1)-3)
    =RIGHT(E1,LEN(E1)-SEARCH(" ",E1,1)+1)

    scores
    =MID(G1,SEARCH("-",G1,1)-2,SEARCH("-",G1,1)+2)
    =LEFT(H1,SEARCH("-",H1,1)-1)
    =MID(H1,SEARCH("-",H1,1)+1,SEARCH(" ",H1,1)-4)

    LOL, it would probably make an Excel teacher cringe, but it works for me. BTW, things have really dried up. Moved on now...


    good luck.
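
    For comparison, the team and score splitting done by the LEFT/MID/RIGHT formulas above could be sketched in Python like this. It assumes game cells look like "241 Lamar at 266 Northwestern St." (rank, team, " at " or " vs ", rank, team) and scores look like "89-86"; the function names are illustrative only:

```python
import re

# Pattern for cells like "241 Lamar at 266 Northwestern St."
GAME = re.compile(
    r"^(?P<away_rank>\d+)\s+(?P<away>.+?)\s+(?:at|vs)\s+"
    r"(?P<home_rank>\d+)\s+(?P<home>.+)$"
)


def split_game(cell):
    """Split a game cell into rank and team for each side.

    With "vs" the second team may not really be at home; the key
    names are kept the same for simplicity.
    """
    match = GAME.match(cell.strip())
    if match is None:
        raise ValueError(f"unrecognized game cell: {cell!r}")
    return {
        "away_rank": int(match["away_rank"]),
        "away": match["away"],
        "home_rank": int(match["home_rank"]),
        "home": match["home"],
    }


def split_score(score):
    """Split a score like "89-86" into two integers."""
    first, second = score.split("-")
    return int(first), int(second)


print(split_game("241 Lamar at 266 Northwestern St."))
print(split_score("89-86"))
```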
    Last edited by subs; 02-26-11 at 07:51 AM.

  17. #17
    Saab
    Join Date: 03-01-09
    Posts: 80
    Betpoints: 91

    I'd have to disagree with you in so many ways on that. Python as a language skips a lot of the bullshit you have to deal with in other languages. What are you going to recommend? Don't do C++ (pointers), don't do C (char arrays, anyone?); if you say Java, I'd say Python over it any day (readability, more results per line of code); C# requires Visual Studio; the list goes on.

    If this guy wants to learn to program, Python is as good as any and IMO better to learn with. If he just wants to fill out some cells in Excel, then run a query through Excel...

  18. #18
    jairocon
    Join Date: 05-30-10
    Posts: 446
    Betpoints: 260

    Quote Originally Posted by Maverick22 View Post
    Here is a tip: don't use Python... I always recommend that new programmers shy away from Python. Don't let the "this seems easy enough" fool you.
    So what do you recommend then? I've been thinking about picking up a programming language as a hobby and everyone I talked to said pick Java...

  19. #19
    Indecent
    Join Date: 09-08-09
    Posts: 758
    Betpoints: 1156

    Quote Originally Posted by jairocon View Post
    So what do you recommend then? I've been thinking about picking up a programming language as a hobby and everyone I talked to said pick Java...
    Java, C#, Python, PHP, any of these will serve you well. Plenty of documentation/help is available online when you are stuck, and all are relatively easy to get off the ground with. For an absolute beginner, Python would be my recommendation, with C# or Java next.

  20. #20
    TCMBob
    Whatever "JQ" isn't playin'
    Join Date: 01-16-11
    Posts: 43
    Betpoints: 240

    Quote Originally Posted by WendysRox View Post
    I use Excel 2003 to scrape data from a few sites. I don't know how to do it in newer versions (because I like 2003 the best), but here's how I do it: go to the "Data" menu, click "Import External Data", then choose "New Web Query". It will open a browser window with little yellow boxes next to every table. Just click the box next to the table containing the data you want and click "OK" or "Import". If you then save the spreadsheet, the next time you open it all you have to do is go to "Data" and click "Refresh Data". I saved mine as "NBA Template" and "NCAAB Template". Every day, I open these templates, get the data, then save the updated sheet as "NBA 02-09-2011" or "NCAAB 02-09-2011" or whatever. Works great for me. Good luck!
    Thanks for the heads up on this wendy

  21. #21
    kpoutlaw
    Join Date: 09-24-10
    Posts: 53

    Hey guys, I am a complete novice when it comes to computer programming and using computer programs in general, and I just learned about "web scraping" data... But I am fiercely determined to learn! So can anyone give me advice on where to begin? I just learned the basic functions of Excel, like adding and subtracting. What else do I need to learn? Should I also learn this Python language, using the link in the "Introduction to Research" thread?

    any thoughts or suggestions would be greatly appreciated

    thanks in advance

  22. #22
    Borat38
    Join Date: 10-15-10
    Posts: 177
    Betpoints: 132

    ^Intro to Research thread would be a good start. I started learning about Python there.

  23. #23
    Saab
    Join Date: 03-01-09
    Posts: 80
    Betpoints: 91

    Great intro to Python (but not necessarily programming in general!) here: http://www.markus-gattol.name/ws/python.html

    I use it every day for both business-related and personal projects, related to sports and gambling as well as other things. It is a versatile language and I love it more every day. That being said, there are always times and cases where you might need something different.

  24. #24
    TCMBob
    Whatever "JQ" isn't playin'
    Join Date: 01-16-11
    Posts: 43
    Betpoints: 240

    Quote Originally Posted by kpoutlaw View Post
    Hey guys, I am a complete novice when it comes to computer programming and using computer programs in general, and I just learned about "web scraping" data... But I am fiercely determined to learn! So can anyone give me advice on where to begin? I just learned the basic functions of Excel, like adding and subtracting. What else do I need to learn? Should I also learn this Python language, using the link in the "Introduction to Research" thread?

    any thoughts or suggestions would be greatly appreciated

    thanks in advance
    http://search.avg.com/?d=4d220162&v=...acro&lng=en-US

    This is what my web search turned up. Sort through and pick a site.

  25. #25
    kpoutlaw
    Join Date: 09-24-10
    Posts: 53

    OK, so you're saying that I should learn about Excel macros... I don't even know what a macro is, lol, but I guess I'll find out soon enough. Thanks.
