Help Creating Line Scraper in Python

  • Romanov
    SBR MVP
    • 10-08-10
    • 4137

    #1
    Help Creating Line Scraper in Python
    Hi,

    I've recently learned a little about Python. Basic BASIC stuff. I would like to learn how to scrape sbrodds or pinnacle's odds (and convert them from decimal form) and then input that data into an excel spreadsheet. Any guidance you could lend would be great. Thanks
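For the "convert them from decimal form" and "input that data into an excel spreadsheet" parts of the question, here is a minimal sketch. It assumes the decimal odds have already been collected somehow (the scraping step itself is not shown) and that a CSV file, which Excel opens directly, is an acceptable target. The function names, the column names, and the output path are made up for illustration.

```python
import csv


def decimal_to_american(decimal_odds):
    """Convert decimal odds (e.g. 1.909) to American odds (e.g. -110)."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    if decimal_odds >= 2.0:
        # Underdog or even money: profit on a 100 stake.
        return round((decimal_odds - 1.0) * 100)
    # Favorite: stake needed to win 100, shown as a negative number.
    return round(-100.0 / (decimal_odds - 1.0))


def write_odds_csv(path, rows):
    """Write (team, decimal_odds) pairs to a CSV file that Excel can open."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["team", "decimal", "american"])
        for team, dec in rows:
            writer.writerow([team, dec, decimal_to_american(dec)])
```

Calling `write_odds_csv("odds.csv", [("Team A", 1.909), ("Team B", 2.50)])` would write one header row and two data rows; the team names here are placeholders, not real lines.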
  • WendysRox
    SBR High Roller
    • 07-22-10
    • 184

    #2
    i use excel 2003 to scrape data from a few sites. Don't know how to do it in newer versions (because I like 2003 the best), but here's how I do it: Go to the "data" menu, then click "import external data", then choose "new web query". It will open a browser window with little yellow boxes next to every table. just click the box next to the table containing the data you want and click "ok" or "import". If you'll then save the spreadsheet, then the next time you open it all you have to do is "data" and "refresh data". I saved mine as "NBA Template" and "NCAAB Template". Every day, I'll open these templates, get the data, then save the updated sheet as "NBA 02-09-2011" or "NCAAB 02-09-2011" or whatever. Works great for me. Good luck!
    Comment
    • mark49
      SBR Rookie
      • 03-03-08
      • 42

      #3
      like WendysRox says, you may not need to do this through Python.

      It all depends on what you want to do exactly.
      Do you want odds at a specific time: opening, closing, or every point in between?
      What do you need your spreadsheet to record?
      How accurately do you want the odds recorded, and converted to which format?

      I personally would just paste Pinnacle's lines into a worksheet and then use a macro to store the information in whatever format I wanted. Very simple to do.
      Comment
      • Romanov
        SBR MVP
        • 10-08-10
        • 4137

        #4
        Okay, I'm using Excel 2008 for Mac so it's a little different. I can get Excel to TRY to download from the web, but I do not know the address/parameters for sbrodds in order to pull that info.
        Comment
        • mike1234
          SBR Sharp
          • 09-06-07
          • 457

          #5
          I think sbrodds uses AJAX. It might be more difficult to scrape - but really not sure.
          Comment
          • byronbb
            SBR MVP
            • 11-13-08
            • 3067

            #6
            yeah getting pinnacle's xml feed is going to be way less difficult than scraping sbrodds.
            Comment
            • Romanov
              SBR MVP
              • 10-08-10
              • 4137

              #7
              I got Pinnacle's lines into Excel using a web query. Anybody know how to get KenPom's predicted win % under FanMatch into Excel? I've been trying but Excel 2008 is a piece of chit. It is awful.
              Comment
              • WendysRox
                SBR High Roller
                • 07-22-10
                • 184

                #8
                I just did it at donbest and sbrsportsbook. The download from sbr wasn't nearly as pretty and Excel had trouble separating the lines and odds (like "-1 -110" showed up in the same cell). But it does work. Keep trying, bud. We're here if you need more help.
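If the data ends up in Python rather than Excel, a combined cell like "-1 -110" can be split on whitespace. This is only a sketch: it assumes every cell holds exactly two numeric tokens (the line, then the odds), and the function name is made up. Cells with other content, such as a pick'em marker, would need extra handling.

```python
def split_line_and_odds(cell):
    """Split a cell like '-1 -110' into (line, odds)."""
    # Assumes exactly two whitespace-separated numeric tokens.
    line_text, odds_text = cell.split()
    return float(line_text), int(odds_text)
```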
                Comment
                • WendysRox
                  SBR High Roller
                  • 07-22-10
                  • 184

                  #9
                  Not sure if this will paste properly, but I'll try. Also, KenPom's fanmatch page imported fine. But, again, the winning team, predicted score and win probability all showed up in the same cell. I assume it's because he put it in the same cell on the table. You *could* write a macro that would go into that cell and separate the data, but that would take a little longer to explain than a forum post from me.

                  Rank Game Prediction Time (ET) Location Venue Thrill Score
                  23 241 Lamar at 266 Northwestern St. Northwestern St. 89-86 (59%) 4:00 PM Natchitoches, LA Prather Coliseum 32.9
                  4 43 Florida St. at 81 Georgia Tech Florida St. 66-65 (50%) 7:00 PM Atlanta, GA Alexander Memorial Coliseum 63.7
                  5 57 Penn St. at 51 Michigan St. Michigan St. 67-63 (67%) 7:00 PM East Lansing, MI Jack Breslin Student Events Ce 60
                  Comment
                  • WendysRox
                    SBR High Roller
                    • 07-22-10
                    • 184

                    #10
                    yeah, it didn't post right, as I suspected. But, you get the idea. In that first game, the cell containing the win prob looked like this: [Northwestern 89-86 (59%)] Again, you could write a macro that would take everything before the open parenthesis out. But, I'd have to spend some time on google to explain it thoroughly. I seem to forget VB as soon as I get the task at hand finished.
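For anyone doing this step in Python instead of a VB macro, the combined prediction cell can be pulled apart with a regular expression. This is a sketch that assumes the cell always looks like "Team 89-86 (59%)", as in the example above; the pattern and function name are my own and are not taken from any existing scraper.

```python
import re

# Team name, winning score, losing score, win probability in percent.
PREDICTION_RE = re.compile(
    r"^(?P<team>.+?)\s+(?P<win>\d+)-(?P<lose>\d+)\s+\((?P<prob>\d+)%\)$"
)


def parse_prediction(cell):
    """Split 'Northwestern St. 89-86 (59%)' into its four parts."""
    match = PREDICTION_RE.match(cell.strip())
    if match is None:
        raise ValueError(f"unexpected prediction format: {cell!r}")
    return (
        match["team"],
        int(match["win"]),
        int(match["lose"]),
        int(match["prob"]),
    )
```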
                    Comment
                    • Romanov
                      SBR MVP
                      • 10-08-10
                      • 4137

                      #11
                      Thanks for all the help wendy.

                      So what are you doing exactly to get KenPom into excel?

                      On Excel 2008 I have to go into textwrangler and write an iqy a la

                      Web
                      1
                      url

                      This works for a couple of the sites that I have tried, including Pinnacle's odds and it imports those perfectly, but when I try KenPom's I get an error message (a vague one at that). Should I be trying to direct excel to the table? How would I do that? I know how to get the table name but I don't know how the iqy is wrong.
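The three-line query file described above can also be generated from a script instead of typed by hand. A minimal sketch, assuming the minimal layout of a type line, a version line, and the URL; it will not by itself fix the KenPom error, and the line endings may need adjusting for a particular Excel version. The function names and the example URL are placeholders.

```python
def build_iqy(url):
    """Return the text of a minimal web query file for the given URL."""
    return "\n".join(["WEB", "1", url]) + "\n"


def write_iqy(path, url):
    """Save the query text so it can be opened as a saved query in Excel."""
    # newline="" writes the "\n" line endings exactly as built above.
    with open(path, "w", newline="") as f:
        f.write(build_iqy(url))
```

For example, `write_iqy("odds.iqy", "http://example.com/odds")` writes a file that could then be run as a saved query.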
                      Comment
                      • DirkDiggs
                        SBR Sharp
                        • 12-07-10
                        • 484

                        #12
                        Would anyone be kind enough to explain to me how to write a query in excel to grab data from a webpage. I have excel 2004 and 2011. I can't seem to figure it out.

                        If I go to Data>Get External Data>Run Query
                        Excel says that I don't have an ODBC driver installed.

                        Thanks in advance.
                        Comment
                        • Romanov
                          SBR MVP
                          • 10-08-10
                          • 4137

                          #13
                          DirkDiggs, try running a saved query: create an iqy file in a text editor, then run it. If you search Google, Microsoft has some pages that explain it in perfect detail.
                          Comment
                          • arwar
                            SBR High Roller
                            • 07-09-09
                            • 208

                            #14
                            Originally posted by WendysRox
                            Not sure if this will paste properly, but I'll try. Also, KenPom's fanmatch page imported fine. But, again, the winning team, predicted score and win probability all showed up in the same cell. I assume it's because he put it in the same cell on the table. You *could* write a macro that would go into that cell and separate the data, but that would take a little longer to explain than a forum post from me.

                            Rank Game Prediction Time (ET) Location Venue Thrill

                            check out kenpom thread - i posted a link to a beta scraper
                            Comment
                            • Maverick22
                              SBR Wise Guy
                              • 04-10-10
                              • 807

                              #15
                              Here is a tip: Don't use Python.

                              Use a language that is more "friendly" for new programmers, which it sounds like you are. (No insult intended.)

                              It is a very easy language to learn and pick up. But if you are not aware of what is going on, and don't pay attention to variables, you can EASILY jack something up. Python can be very dangerous if you aren't paying attention and don't understand what Python will do without you knowing it.

                              I always recommend that new programmers shy away from Python. Don't let the "this seems easy enough" fool you.
                              Comment
                              • subs
                                SBR MVP
                                • 04-30-10
                                • 1412

                                #16
                                i have a sheet that separates the teams and scores just using Left, RIGHT, MID etc. with some help from the forum.

                                ganchrow even added something that i tried and tried to get right but sadly failed... may be useful to some1 as a template?

                                teams
                                =MID(B1,SEARCH(" ",B1,1)+1,SEARCH(" at ",B1,1))
                                =LEFT(C1,SEARCH(" at",C1,1))
                                =RIGHT(B1,LEN(B1)-SEARCH(IF(ISERR(FIND(" at ", B1, 1)), " vs ", " at "),B1,1)-3)
                                =RIGHT(E1,LEN(E1)-SEARCH(" ",E1,1)+1)

                                scores
                                =MID(G1,SEARCH("-",G1,1)-2,SEARCH("-",G1,1)+2)
                                =LEFT(H1,SEARCH("-",H1,1)-1)
                                =MID(H1,SEARCH("-",H1,1)+1,SEARCH(" ",H1,1)-4)

                                LOL, prolly make an excel teacher cringe but it works 4 me. BTW things have really dried up. moved on now...
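A rough Python counterpart to the "teams" formulas above, for anyone who has the game cell in a script rather than a worksheet. It is a sketch that assumes the cell holds only the matchup in the form "241 Lamar at 266 Northwestern St." (rank, team, "at" or "vs", rank, team); the pattern and function name are my own.

```python
import re

# away rank, away team, separator ("at" or "vs"), home rank, home team
GAME_RE = re.compile(
    r"^(?P<away_rank>\d+)\s+(?P<away>.+?)\s+(?:at|vs\.?)\s+"
    r"(?P<home_rank>\d+)\s+(?P<home>.+)$"
)


def parse_game(cell):
    """Split a matchup cell into (away_rank, away, home_rank, home)."""
    match = GAME_RE.match(cell.strip())
    if match is None:
        raise ValueError(f"unexpected game format: {cell!r}")
    return (
        int(match["away_rank"]),
        match["away"],
        int(match["home_rank"]),
        match["home"],
    )
```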


                                good luck.
                                Last edited by subs; 02-26-11, 08:51 AM.
                                Comment
                                • Saab
                                  SBR Hustler
                                  • 03-01-09
                                  • 80

                                  #17
                                  I'd have to disagree with you in so many ways on that. Python as a language skips out on a lot of bullshit you have to deal with in other languages. What are you going to recommend? Don't do C++ (pointers), don't do C (char arrays, anyone?), if you say Java I'd say Python over it any day (readability, fewer lines of code for the same results), C# requires Visual Studio, the list goes on.

                                  If this guy wants to learn to program, Python is as good as any and IMO better to learn with. If he just wants to fill out some cells in Excel, then run a query through Excel...
                                  Comment
                                  • jairocon
                                    SBR Sharp
                                    • 05-30-10
                                    • 446

                                    #18
                                    Originally posted by Maverick22
                                    Here is a tip: Don't use Python....I always recommend that new programmers shy away from Python. Don't let the "this seems easy enough" fool you
                                    So what do you recommend then? I've been thinking about picking up a programming language as a hobby and everyone I talked to said pick java...
                                    Comment
                                    • Indecent
                                      SBR Wise Guy
                                      • 09-08-09
                                      • 758

                                      #19
                                      Originally posted by jairocon
                                      So what do you recommend then? I've been thinking about picking up a programming language as a hobby and everyone I talked to said pick java...
                                      Java, C#, Python, PHP, any of these will serve you well. Plenty of documentation/help available online where you are stuck, and all are relatively easy to get off the ground with. For an absolute beginner, Python would be my recommendation, with C# or Java next.
                                      Comment
                                      • TCMBob
                                        SBR Rookie
                                        • 01-16-11
                                        • 43

                                        #20
                                        Originally posted by WendysRox
                                        i use excel 2003 to scrape data from a few sites. Don't know how to do it in newer versions (because I like 2003 the best), but here's how I do it: Go to the "data" menu, then click "import external data", then choose "new web query". It will open a browser window with little yellow boxes next to every table. just click the box next to the table containing the data you want and click "ok" or "import". If you'll then save the spreadsheet, then the next time you open it all you have to do is "data" and "refresh data". I saved mine as "NBA Template" and "NCAAB Template". Every day, I'll open these templates, get the data, then save the updated sheet as "NBA 02-09-2011" or "NCAAB 02-09-2011" or whatever. Works great for me. Good luck!
                                        Thanks for the heads up on this wendy
                                        Comment
                                        • kpoutlaw
                                          SBR Hustler
                                          • 09-24-10
                                          • 53

                                          #21
                                          Hey guys, I am a complete novice when it comes to computer programming and using computer programs in general, and I just learned about "web scraping" data....But I am fiercely determined to learn! So can anyone give me advice as to where to begin? I just learned the basic functions of Excel like adding and subtracting..What else do I need to learn?? Should I also learn this Python language...using the link in the "Introduction to Research" thread??

                                          any thoughts or suggestions would be greatly appreciated

                                          thanks in advance
                                          Comment
                                          • Borat38
                                            SBR High Roller
                                            • 10-15-10
                                            • 177

                                            #22
                                            ^Intro to Research thread would be a good start. I started learning about Python there.
                                            Comment
                                            • Saab
                                              SBR Hustler
                                              • 03-01-09
                                              • 80

                                              #23
                                              great intro to python (but not necessarily programming in general!) here: http://www.markus-gattol.name/ws/python.html


                                              I use it everyday for both business related and personal projects, both related to sports and gambling, and other things. It is a versatile language and I love it more everyday. That being said, there are always times and cases where you might need something different.
                                              Comment
                                              • TCMBob
                                                SBR Rookie
                                                • 01-16-11
                                                • 43

                                                #24
                                                Originally posted by kpoutlaw
                                                Hey guys, I am a complete novice when it comes to computer programming and using computer programs in general, and I just learned about "web scraping" data....But I am fiercely determined to learn! So can anyone give me advice as to where to begin? I just learned the basic functions of Excel like adding and subtracting..What else do I need to learn?? Should I also learn this Python language...using the link in the "Introduction to Research" thread??

                                                any thoughts or suggestions would be greatly appreciated

                                                thanks in advance


                                                this is what my web search turned up. sort through and pick a site.
                                                Comment
                                                • kpoutlaw
                                                  SBR Hustler
                                                  • 09-24-10
                                                  • 53

                                                  #25
                                                  ok ..so you're saying that i should learn about excel macros...i don't even know what a macro is..lol..but i guess ill find out soon enough...thanks
                                                  Comment