1. #1
    strixee
    I think, therefore I win
    Join Date: 05-31-10
    Posts: 432

    Scraping data from Oddsportal

    I'd like to scrape some odds from Oddsportal, but the data seems to be stored in some kind of XML database.
    The JavaScript code that manipulates it must be this 0.5 MB monster:
    Code:
    http://www.oddsportal.com/res/x/proto-1108031134.js
    I don't think it's easy to access the database directly, but it should at least be possible to gather the data using AJAX requests. Has anyone worked with OP? Or do you have any advice on how to research it? Should I start from functions such as
    Code:
    XMLHttpRequest, Ajax.Request
    ?
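One rough way to begin that kind of research is to search the script text for the request call sites and read the code around each hit for URL-like strings. The Python sketch below does only that; the `sample_js` string is a made-up stand-in for the real script, and `find_call_sites` is a hypothetical helper name, not anything from Oddsportal's code.

```python
import re

# Call sites worth looking for; both names come from the post above
PATTERN = re.compile(r"XMLHttpRequest|Ajax\.Request")


def find_call_sites(js_source, context=40):
    """Return (offset, snippet) pairs for each request call site found."""
    hits = []
    for m in PATTERN.finditer(js_source):
        start = max(0, m.start() - context)
        end = min(len(js_source), m.end() + context)
        hits.append((m.start(), js_source[start:end]))
    return hits


# Made-up stand-in for the real 0.5 MB script
sample_js = 'function load(u){new Ajax.Request(u,{method:"get"});}var x=new XMLHttpRequest();'
hits = find_call_sites(sample_js)
```

Each snippet shows the surrounding characters, which is usually where the request URL or the function that builds it appears.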

  2. #2
    uva3021
    Join Date: 03-01-07
    Posts: 537
    Betpoints: 381

    Screen scrape it: send an XMLHTTP request to the web address and you'll get a response whose HTML you can then navigate by accessing its properties.

  3. #3
    strixee
    I think, therefore I win
    Join Date: 05-31-10
    Posts: 432

    You mean using cURL or some already made application for the screen scraping?

  4. #4
    uva3021
    Join Date: 03-01-07
    Posts: 537
    Betpoints: 381

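    In VB (with references to Microsoft XML and the Microsoft HTML Object Library):

    ' URL is a String holding the address of the page to fetch
    Dim XMLHttpRequest As MSXML2.XMLHTTP
    Set XMLHttpRequest = New MSXML2.XMLHTTP
    XMLHttpRequest.Open "GET", URL, False   ' False = synchronous request
    XMLHttpRequest.send

    ' Load the response text into an HTML document so it can be walked as a DOM
    Dim HTMLDoc As HTMLDocument
    Set HTMLDoc = New HTMLDocument
    HTMLDoc.body.innerHTML = XMLHttpRequest.responseText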

  5. #5
    uva3021
    Join Date: 03-01-07
    Posts: 537
    Betpoints: 381

    Then treat the HTML like an XML document.

  6. #6
    vyomguy
    Join Date: 12-08-09
    Posts: 5,794
    Betpoints: 234

    try curl.

  7. #7
    Pot luck
    Join Date: 05-05-11
    Posts: 40
    Betpoints: 788

    In PHP:

    Get the HTML as a string using curl:
    PHP Code:
    $url = "www.google.com";
    $ch = curl_init(); // create curl resource
    curl_setopt($ch, CURLOPT_URL, $url); // set url
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the transfer as a string
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2); // sets the connection timeout to 2 seconds
    $output = curl_exec($ch); // HTML as a string, or false on failure
    curl_close($ch); // close curl resource to free up system resources
    Parse into a DOM object:
    PHP Code:
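    libxml_use_internal_errors(true); // keep malformed real-world HTML from flooding the output with warnings
    $doc = new DOMDocument();
    $doc->loadHTML($output);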
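
Once the HTML is in hand, the next step is pulling values out of it. As a rough sketch of that step (in Python rather than PHP, standard library only), the snippet below collects the text of every table cell from an HTML string. The sample markup, the `CellCollector` class name, and the team and odds values are all made up for illustration and are not Oddsportal's real page structure.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text pieces of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical markup standing in for a fetched page
sample_html = """
<table>
  <tr><td>Team A - Team B</td><td>1.85</td><td>3.40</td><td>4.20</td></tr>
  <tr><td>Team C - Team D</td><td>2.10</td><td>3.25</td><td>3.50</td></tr>
</table>
"""

collector = CellCollector()
collector.feed(sample_html)
rows = collector.rows
```

Here `rows` ends up as one list of strings per table row. On a real page the tag and attribute checks would need to match whatever markup the site actually serves.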
    Last edited by Pot luck; 08-16-11 at 08:27 PM.

  8. #8
    strixee
    I think, therefore I win
    Join Date: 05-31-10
    Posts: 432

    Pot luck, this simple method works for non-AJAX websites only, since cURL fetches the raw HTML and never runs the page's JavaScript.

  9. #9
    Pot luck
    Join Date: 05-05-11
    Posts: 40
    Betpoints: 788

    Yeah, true. I am scraping from the non-AJAX bit of Oddsportal (e.g. http://www.oddsportal.com/matches/). What do you want to get from there? I emailed them a while back asking how I could get the AJAX-fetched data, but got a response along the lines of "huh, do what I dunno?".

    Interested to see if you get further with this.

  10. #10
    strixee
    I think, therefore I win
    Join Date: 05-31-10
    Posts: 432

    I want to get the current odds from a few bookies, mainly SBO, 188 and Pinnacle.
    The only easy thing is switching between the markets. For example, for O/U 1st half you just need to add #over-under;3 to the URL.
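
One thing worth keeping in mind here: the part after `#` is a URL fragment, which the browser keeps to itself and does not send to the server, so the market switch is presumably handled by the page's JavaScript rather than by a different server response. A small Python sketch of building such a URL and splitting it again follows; the match path is a made-up placeholder, `market_url` is a hypothetical helper, and only the `#over-under;3` suffix comes from the post above.

```python
from urllib.parse import urlsplit


def market_url(match_url, market, scope):
    """Append a market fragment such as '#over-under;3' to a match URL."""
    base = match_url.split("#", 1)[0]  # drop any existing fragment
    return f"{base}#{market};{scope}"


# Hypothetical match page; only the fragment format is taken from the thread
url = market_url("http://www.oddsportal.com/some-match/", "over-under", 3)
parts = urlsplit(url)
```

`parts.fragment` holds the market selector, while `parts.path` is all the server would see of the request, which is why a plain cURL fetch of the fragment URL returns the same HTML as the base URL.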

  11. #11
    Pot luck
    Join Date: 05-05-11
    Posts: 40
    Betpoints: 788

    Maybe hiring a freelancer would be the way to go.
