Scraping SBObet, worth a shot?

  • pringles
    SBR Rookie
    • 11-26-12
    • 41

    #1
    Scraping SBObet, worth a shot?
    SBObet only offers an XML feed to bigger affiliates.
    As I am a simple bettor but still need the numbers, I've talked to a skilled programmer about writing me a scraper.

    The problem is whether I will get blocked and, if not, how many minutes I should wait between scrapes of the lines.

    I appreciate your answers/suggestions.

    //

    I would like to know if scraping the main lines (AH, totals) on all the soccer leagues is too much data (excuse me, I don't know much about programming, but do you still have to scrape the whole page in order to do that?) ... will I get blocked if I do this, let's say, once every few minutes?


  • HUY
    SBR Sharp
    • 04-29-09
    • 253

    #2
    Originally posted by pringles
    SBObet only offers an XML feed to bigger affiliates.
    As I am a simple bettor but still need the numbers, I've talked to a skilled programmer about writing me a scraper.

    The problem is whether I will get blocked and, if not, how many minutes I should wait between scrapes of the lines.

    I appreciate your answers/suggestions.

    //

    I would like to know if scraping the main lines (AH, totals) on all the soccer leagues is too much data (excuse me, I don't know much about programming, but do you still have to scrape the whole page in order to do that?) ... will I get blocked if I do this, let's say, once every few minutes?


    You won't have any problems with the amount of data. You will be blocked if you request pages too often. Just download every few minutes and you should be fine. Also, send the User-Agent string of a well-known browser while you download stuff. If you do get blocked despite all that, a quick reset of the router will often get you a new IP (if your ISP assigns addresses dynamically) and sidestep the ban.
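
For anyone who wants to see what that advice looks like in practice, here is a minimal Python sketch using only the standard library. The User-Agent string, the 4-minute interval and the function names are all assumptions made for illustration; nothing here is specific to SBObet.

```python
import urllib.request

# Assumption: any mainstream browser string will do; this one is illustrative.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:115.0) "
    "Gecko/20100101 Firefox/115.0"
)
# Assumption: "every few minutes" is taken to mean 4 minutes here.
MIN_INTERVAL_S = 240


def build_request(url):
    """Build a request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": BROWSER_UA})


def seconds_until_next_fetch(last_fetch, now, min_interval=MIN_INTERVAL_S):
    """Return how long to wait so that fetches stay min_interval apart."""
    if last_fetch is None:
        return 0.0
    return max(0.0, min_interval - (now - last_fetch))


def fetch(url, timeout=30):
    """Download the page body as bytes (performs a real network call)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read()
```

A caller would sleep for `seconds_until_next_fetch(...)` before each `fetch(...)`, which keeps the request rate at one page per interval no matter how long parsing takes.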

    I've written scrapers for many websites and I can scrape SBObet for you, contact me via PM so we can discuss pricing if you're interested.
    Comment
    • hubie69
      SBR Hall of Famer
      • 09-16-10
      • 7329

      #3
      And really it doesn't even take a skilled programmer to do it.
      Comment
      • pringles
        SBR Rookie
        • 11-26-12
        • 41

        #4
        Originally posted by hubie69
        And really it doesn't even take a skilled programmer to do it.
        true
        Comment
        • HUY
          SBR Sharp
          • 04-29-09
          • 253

          #5
          Originally posted by pringles
          true
          You don't need to be a star programmer simply to scrape a website, but you do need a firm understanding of the technologies involved in order to make your program resistant to bans, resilient to unreliable connections and able to run 24/7/365 (e.g. use proper logging, restart automatically if the host server is rebooted, etc.).

          As always, the devil is in the details.
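
As a rough illustration of the "connection unreliability" and "proper logging" points, here is a hedged Python sketch of a retry wrapper with exponential backoff. The function name `with_retries`, the attempt count and the delays are assumptions, not anything prescribed in this thread. Restarting after a reboot is left to the operating system (for example a cron `@reboot` entry or a service manager).

```python
import logging
import time

logging.basicConfig(
    level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s"
)
log = logging.getLogger("scraper")


def with_retries(fetch, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); on a network error wait base_delay * 2**n and retry.

    urllib.error.URLError and socket timeouts are OSError subclasses,
    so catching OSError covers the usual connection failures.
    """
    for n in range(attempts):
        try:
            return fetch()
        except OSError as exc:
            if n == attempts - 1:
                log.error("giving up after %d attempts: %s", attempts, exc)
                raise
            delay = base_delay * 2 ** n
            log.warning(
                "attempt %d failed (%s); retrying in %.0fs", n + 1, exc, delay
            )
            sleep(delay)
```

The `sleep` parameter exists only so the wait can be swapped out in tests; in normal use the default `time.sleep` is fine.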
          Comment
          • SportsInsights
            SBR High Roller
            • 01-05-09
            • 119

            #6
            SBO is a difficult site to scrape. You'll need an account as well as the ability to use a proxy network. They monitor usage.
            Comment
            • hubie69
              SBR Hall of Famer
              • 09-16-10
              • 7329

              #7
              Originally posted by HUY
              You don't need to be a star programmer simply to scrape a website, but you do need a firm understanding of the technologies involved in order to make your program resistant to bans, resilient to unreliable connections and able to run 24/7/365 (e.g. use proper logging, restart automatically if the host server is rebooted, etc.).

              As always, the devil is in the details.
              I disagree to a point, but it depends on the OS and what you code in. Parsing XML relatively error-free can be done in under 50 lines of code, at least on a Linux box using either bash or PHP. The ban resistance is easy, as there really is no need to scrape more than once every 4 or 5 minutes; I doubt you'll get banned for that type of activity. And it doesn't need to run 24/7/365: you can do it on a per-sport basis and just cronjob it for once every X minutes. As well, you can just use curl or wget to grab the entire XML page and run the parsing locally.
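
To give a feel for how short the parsing step can be, here is a sketch in Python rather than bash or PHP. The XML layout in `SAMPLE` (element and attribute names, teams, prices) is entirely made up for illustration, since the real feed format is not shown in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed layout; the real element and attribute names will differ.
SAMPLE = """<odds>
  <match id="1" league="Example League" home="Team A" away="Team B">
    <line type="AH" handicap="-0.5" home="1.95" away="1.97"/>
    <line type="OU" handicap="2.5" home="1.90" away="2.00"/>
  </match>
</odds>"""


def parse_lines(xml_text):
    """Flatten every <line> under every <match> into a list of dicts."""
    rows = []
    root = ET.fromstring(xml_text)
    for match in root.iter("match"):
        for line in match.iter("line"):
            rows.append({
                "match_id": match.get("id"),
                "type": line.get("type"),
                "handicap": float(line.get("handicap")),
                "home": float(line.get("home")),
                "away": float(line.get("away")),
            })
    return rows


if __name__ == "__main__":
    for row in parse_lines(SAMPLE):
        print(row)
```

Saved under a hypothetical name like `scrape.py`, a crontab entry such as `*/5 * * * * python3 scrape.py` would run it every 5 minutes, which matches the "cronjob it for once every X minutes" idea.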

              Hooray for Linux
              Comment
              • hubie69
                SBR Hall of Famer
                • 09-16-10
                • 7329

                #8
                Originally posted by SportsInsights
                SBO is a difficult site to scrape. You'll need an account as well as the ability to use a proxy network. They monitor usage.

                You could use Hide My A$$ for this, it's free
                Comment
                • HUY
                  SBR Sharp
                  • 04-29-09
                  • 253

                  #9
                  Originally posted by hubie69
                  I disagree to a point, but it depends on the OS and what you code in. Parsing XML relatively error-free can be done in under 50 lines of code, at least on a Linux box using either bash or PHP. The ban resistance is easy, as there really is no need to scrape more than once every 4 or 5 minutes; I doubt you'll get banned for that type of activity. And it doesn't need to run 24/7/365: you can do it on a per-sport basis and just cronjob it for once every X minutes. As well, you can just use curl or wget to grab the entire XML page and run the parsing locally.

                  Hooray for Linux
                  You don't need linux to parse xml, run bash, run php, run wget or run curl. Try to contribute something to the thread please.
                  Comment
                  • Fair
                    SBR High Roller
                    • 11-25-10
                    • 216

                    #10
                    Sorry... but why scrape data from a website when you have other sites (like oddsportal) that have all the line movements?
                    Comment
                    • HUY
                      SBR Sharp
                      • 04-29-09
                      • 253

                      #11
                      Originally posted by Fair
                      Sorry... but why scrape data from a website when you have other sites (like oddsportal) that have all the line movements?
                      So you should scrape oddsportal instead, is that what you are saying?
                      Comment
                      • hubie69
                        SBR Hall of Famer
                        • 09-16-10
                        • 7329

                        #12
                        Originally posted by HUY
                        You don't need linux to parse xml, run bash, run php, run wget or run curl. Try to contribute something to the thread please.
                        No, you don't need linux to do it. It makes it easier once you learn it though. Simply stating that it may be helpful for the op to use Linux if he currently doesn't. If that doesn't contribute enough for you, I also chimed in with ban resistances, from what I've found to be true over the past few years of scraping XML data myself. Learn to not only read my post, but understand the words that are typed in it before you slander me.
                        Comment
                        • strixee
                          SBR Sharp
                          • 05-31-10
                          • 432

                          #13
                          pringles, approximately how much does such a scraper cost?
                          Comment
                          • pringles
                            SBR Rookie
                            • 11-26-12
                            • 41

                            #14
                            Originally posted by strixee
                            pringles, approximately how much does such a scraper cost?
                            Well, I'm using both a very skilled designer and a programmer; we have done the interface and the programming starts in a few days.
                            I'm paying around 3k€ for the whole set-up.
                            Comment
                            • Maverick22
                              SBR Wise Guy
                              • 04-10-10
                              • 807

                              #15
                              You are paying 3000€ for a website scraper? That only scrapes one site?
                              Comment
                              • sideloaded
                                SBR Hall of Famer
                                • 08-21-10
                                • 7561

                                #16
                                Originally posted by HUY
                                You don't need linux to parse xml, run bash, run php, run wget or run curl. Try to contribute something to the thread please.
                                why on earth would you do all that on something NOT based on linux? You setting up your ultra complex scraper on Solaris?
                                Comment
                                • sideloaded
                                  SBR Hall of Famer
                                  • 08-21-10
                                  • 7561

                                  #17
                                  Originally posted by pringles
                                  Well, I'm using both a very skilled designer and a programmer; we have done the interface and the programming starts in a few days.
                                  I'm paying around 3k€ for the whole set-up.
                                  You're overpaying. No need for a skilled programmer for this. Hire a 9th grader and buy him a 3DS or something.
                                  Comment
                                  • HUY
                                    SBR Sharp
                                    • 04-29-09
                                    • 253

                                    #18
                                    Originally posted by sideloaded
                                    why on earth would you do all that on something NOT based on linux? You setting up your ultra complex scraper on Solaris?
                                    Cygwin.
                                    Comment
                                    • sideloaded
                                      SBR Hall of Famer
                                      • 08-21-10
                                      • 7561

                                      #19
                                      yeah, but if you're scraping, 99 percent of the time you are deploying to a VPS running Linux

                                      cygwin and Windows is just gross
                                      Comment
                                      • Fair
                                        SBR High Roller
                                        • 11-25-10
                                        • 216

                                        #20
                                        I mean... if you are interested in line movements, there are a lot of sites that offer all the information you want, all the historical data from so many bookies. So why pay 3000€ for information that is available for free? In the end... if you do this to bet and to earn some money... you start with a bankroll of -3000 ... are you kidding me?
                                        Comment
                                        • hubie69
                                          SBR Hall of Famer
                                          • 09-16-10
                                          • 7329

                                          #21
                                          Originally posted by HUY
                                          Cygwin.
                                          Sorry bud, but why try to put a Linux layer over the top of Windows when you can simply buy a $5 machine from a garage sale and actually run Linux? Scraping requires virtually zero resources, and if done on actual Linux it's portable to any *nix-based box. Not judging; it's just that, being a Linux admin and a network admin for a living, it seems odd to me.
                                          Comment
                                          • hubie69
                                            SBR Hall of Famer
                                            • 09-16-10
                                            • 7329

                                            #22
                                            Originally posted by sideloaded
                                            yeah, but if you're scraping, 99 percent of the time you are deploying to a VPS running Linux

                                            cygwin and Windows is just gross

                                            Comment
                                            • HUY
                                              SBR Sharp
                                              • 04-29-09
                                              • 253

                                              #23
                                              Originally posted by hubie69
                                              Sorry bud, but why try to put a Linux layer over the top of Windows when you can simply buy a $5 machine from a garage sale and actually run Linux? Scraping requires virtually zero resources, and if done on actual Linux it's portable to any *nix-based box. Not judging; it's just that, being a Linux admin and a network admin for a living, it seems odd to me.
                                              More machines = more problems.

                                              Also, I'm working on a laptop and linux does not play very well with laptops.
                                              Comment
                                              • Maverick22
                                                SBR Wise Guy
                                                • 04-10-10
                                                • 807

                                                #24
                                                Dude... go to a pawn shop. Find the cheapest computer you can find. Put linux on it. Deploy all your code there. Then thank us later
                                                Comment
                                                • Maverick22
                                                  SBR Wise Guy
                                                  • 04-10-10
                                                  • 807

                                                  #25
                                                  Plus... a dedicated server running a scraper makes your life easier... not harder.

                                                  Sometimes more computers is more complexity... but not in this case. Not in this case at all.
                                                  Comment
                                                  • pringles
                                                    SBR Rookie
                                                    • 11-26-12
                                                    • 41

                                                    #26
                                                    Originally posted by Maverick22
                                                    You are paying 3000€ for a website scraper? That only scrapes one site?
                                                    I'm using a designer + an initial programmer for a scraper that takes everything, lines and statistics, and writes it to a DB on my server.
                                                    Then a second programmer to add algorithms and make an iPhone app with alerts.
                                                    Comment
                                                    • Maverick22
                                                      SBR Wise Guy
                                                      • 04-10-10
                                                      • 807

                                                      #27
                                                      I would have a conversation with each developer and "designer".

                                                      After the whole thing is finished, I would try to get a copy of all the source code, including all the database scripts and documentation. (For those prices, it had better come with some documentation.)

                                                      Since you are paying for it, you (should) own it.

                                                      You might not think you will need it, but you may one day. And you will not want to chase down a guy for the code years later.

                                                      Just my thoughts anyways.
                                                      Comment
                                                      • HUY
                                                        SBR Sharp
                                                        • 04-29-09
                                                        • 253

                                                        #28
                                                        Originally posted by pringles
                                                        I'm using a designer + an initial programmer for a scraper that takes everything, lines and statistics, and writes it to a DB on my server.
                                                        Then a second programmer to add algorithms and make an iPhone app with alerts.
                                                        What will those "algorithms" do? Tell you what to bet? If so, I have a whole new world waiting for you: They're called "touts".
                                                        Comment
                                                        • SportsInsights
                                                          SBR High Roller
                                                          • 01-05-09
                                                          • 119

                                                          #29
                                                          Just so you know, SBOBet offers an XML feed.
                                                          Comment
                                                          • arwar
                                                            SBR High Roller
                                                            • 07-09-09
                                                            • 208

                                                            #30
                                                            Well, I have been writing scrapers for years. I have written so many of them I can do it in my sleep. For 4 years now I have been running a scraper against a popular site that runs roughly every 3 minutes, 24/7/365. Even though respected posters on here said you can get banned for overuse there, it has never happened to me on this site. I have a little random routine that adds 0-60 seconds to 2 minutes between scrapes. If it were to hit exactly every 3 minutes, it might attract attention. This site has an RSS feed, but I find it lags behind the real-time data.

                                                            I have scraped hundreds of different sites, both for historical and real-time data. The only time I have ever been booted was once on Yahoo. If I remember correctly, they had some kind of weird numbering convention for MLB games, so in an attempt to grab all the data I ran some kind of loop like 100000 to 600000, and it returned a lot of 404 (page not found) errors. This apparently attracted the attention of some sysop, who blocked me. I was able to get back in after 20 minutes (I have a static IP from my ISP), and after adjusting the logic of my program so it only requested pages that could actually be served, I never had any more problems.

                                                            As for whoever posted about getting a different IP address: usually the DHCP server will assign a lease with the same IP to the same MAC address if possible. It used to be that it was always different, not so much anymore.

                                                            The weirdest scraper I wrote was for some guy betting tennis at an offshore book that changed the line on the match after every point. I couldn't figure out how he could get a bet down between points. I guess he just bet between games.

                                                            Some of these scrapers get very complicated. With all the advanced JavaScript, even waiting for readyState == 4 doesn't work. I am competent with Linux, but curl isn't going to return data that's not there. I am curious now: can somebody post the URL to this site?
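
The random routine mentioned above (2 minutes plus 0-60 seconds between scrapes) could look something like this in Python. This is only a sketch of the idea, not the poster's actual code; the constant and function names are made up.

```python
import random

BASE_DELAY_S = 120  # fixed part of the wait: 2 minutes
MAX_JITTER_S = 60   # random part of the wait: 0 to 60 seconds


def next_delay(rng=random):
    """Return the wait before the next scrape, between 120 and 180 seconds."""
    return BASE_DELAY_S + rng.uniform(0, MAX_JITTER_S)


if __name__ == "__main__":
    # Print a few sample waits; a real loop would time.sleep(next_delay()).
    for _ in range(5):
        print(round(next_delay(), 1))
```

Because each wait is drawn independently, the requests never settle into a fixed period, which is the property the post is after.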
                                                            Comment
                                                            • arwar
                                                              SBR High Roller
                                                              • 07-09-09
                                                              • 208

                                                              #31
                                                              Well, I zipped over to SBObet.com and took a look at it. I think the guy wanted soccer, which of course is football. I didn't need an account to see lines, but it looked like they had tons of different leagues (Japan, etc.) and a shitload of games. I didn't look under the hood at the way the site was coded, but it looks like it would be relatively simple to scrape. These are live odds, so I am not quite sure what the guy is looking for in a scraper. Maybe line movements? Otherwise just go there and look at the odds (decimal, btw). I saw a bunch of different days at the top of the odds page, but I didn't go in there. The only soccer scrapers I wrote before tracked results. The scraping itself is simple; tracking all the different leagues, etc. adds a lot of overhead.
                                                              Comment
                                                              • slobib
                                                                SBR Rookie
                                                                • 07-26-06
                                                                • 43

                                                                #32
                                                                I would like a scraper; unfortunately I don't have enough posts and can't send PMs. Anyone willing to write it for a payment, please send me a PM for details.
                                                                Thanks.
                                                                Comment
                                                                • aramakilx
                                                                  SBR High Roller
                                                                  • 01-18-13
                                                                  • 195

                                                                  #33
                                                                  It's impossible for a simple user to obtain the XML feed; you can only get it if you have a big site like SBR. What kind of odds do you need: pre-game or live? Maybe you can look for another bookie?
                                                                  Comment
                                                                  • slobib
                                                                    SBR Rookie
                                                                    • 07-26-06
                                                                    • 43

                                                                    #34
                                                                    Pre-game odds. Maybe from other Asian bookies or Pinnacle.
                                                                    Comment
                                                                    • slobib
                                                                      SBR Rookie
                                                                      • 07-26-06
                                                                      • 43

                                                                      #35
                                                                      I need it to compare with local bookies and get the best odds as fast as possible.
                                                                      There are sites that do this, but I think they lack quality and aren't optimal.
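
Once the prices are scraped, the comparison step itself is small. Here is a hedged Python sketch that picks the best decimal price per outcome across bookmakers. The bookmaker names, outcome labels and odds in the usage example are invented for illustration.

```python
def best_odds(quotes):
    """Return {outcome: (bookmaker, odds)} with the highest decimal odds.

    quotes maps bookmaker name -> {outcome: decimal odds}.
    """
    best = {}
    for book, prices in quotes.items():
        for outcome, odds in prices.items():
            if outcome not in best or odds > best[outcome][1]:
                best[outcome] = (book, odds)
    return best


if __name__ == "__main__":
    # Made-up prices for a single match at two hypothetical books.
    quotes = {
        "book_a": {"home": 1.95, "away": 1.97},
        "book_b": {"home": 2.01, "away": 1.90},
    }
    print(best_odds(quotes))
```

With decimal odds, higher is always better for the bettor, so a plain maximum per outcome is enough; matching the same game across different bookmakers' naming schemes is the harder part and is not covered here.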
                                                                      Comment