1. #1
    illfuuptn
    Join Date: 03-17-10
    Posts: 1,860

    Why is scrape not working?

    Idk why it is returning this: [' '] but it is and it makes no sense.


    webpage=urlopen('http://www.sportsbookreview.com/betting-odds/ncaa-basketball/?date=20120102').read()
    findTeamName= re.compile('"team-name".*>(.*)
    find_it = re.findall(findTeamName,webpage)
    print find_it
    [' ']


    I've checked tutorials and the pages they scrape work fine when I do them. So what's up with this?
    Last edited by illfuuptn; 12-09-12 at 01:08 AM. Reason: find_it should be on new line btw

  2. #2
    illfuuptn
    Join Date: 03-17-10
    Posts: 1,860

    Sorry for no code tags, but when I put them on it deletes stuff for some reason.

  3. #3
    illfuuptn
    Join Date: 03-17-10
    Posts: 1,860

    Can't post the page's source code because SBR reads it. So I guess you can look for yourself if you want to double-check.
    Last edited by illfuuptn; 12-09-12 at 01:05 AM.

  4. #4
    illfuuptn
    Join Date: 03-17-10
    Posts: 1,860

    Any help appreciated.

  5. #5
    Maverick22
    Join Date: 04-10-10
    Posts: 807
    Betpoints: 58

    Whatever language that is... I don't know it. Sorry that I am not able to help you.

  6. #6
    illfuuptn
    Join Date: 03-17-10
    Posts: 1,860

    Quote Originally Posted by Maverick22
    Whatever language that is... I don't know it. Sorry that I am not able to help you.

    That's quite all right, Maverick. You've helped me many times in the past. It's Python, for the record.

  7. #7
    Blax0r
    Join Date: 10-13-10
    Posts: 688
    Betpoints: 1512

    Hey illfuuptn, I'm not very familiar with Python, but I think you should try adding a question mark after each .* in your regular expression.

    So it would start like this: findTeamName = re.compile('"team-name".*?>(.*?) followed by the rest of your pattern. You also need to put something after the (.*?), such as the closing tag, so the group knows where to stop; a lazy group at the very end of a pattern matches as little as possible, which is nothing at all.

    The question mark makes the quantifier "not greedy":
    http://www.regular-expressions.info/optional.html

    Hope this helps.
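
    To make the greedy versus non-greedy point concrete, here is a minimal sketch in Python 3 syntax. The HTML string is a made-up stand-in (the real SBR markup is not shown in this thread), and the closing `</a>` in the second pattern is an assumption about what follows the team name.

```python
import re

# Hypothetical one-line snippet; NOT the real SBR markup.
html = ('<a class="team-name" href="#">Duke</a> '
        '<a class="team-name" href="#">Kansas</a> ')

# Greedy: ".*>" runs to the LAST ">" on the line, so the group
# only captures whatever trails it (here, a single space).
greedy = re.findall(r'"team-name".*>(.*)', html)

# Lazy, with something after the group to stop at.
lazy = re.findall(r'"team-name".*?>(.*?)</a>', html)

# Lazy, with nothing after the group: it matches the empty string.
trailing = re.findall(r'"team-name".*?>(.*?)', html)

print(greedy)    # [' ']
print(lazy)      # ['Duke', 'Kansas']
print(trailing)  # ['', '']
```

    On this sample the greedy pattern reproduces the [' '] result from the first post, which suggests (but does not prove) that the same thing is happening on the real page.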

  8. #8
    EXhoosier10
    Join Date: 07-06-09
    Posts: 3,122
    Betpoints: 4390

    Quote Originally Posted by illfuuptn
    Idk why it is returning this: [' '] but it is and it makes no sense.


    webpage=urlopen('http://www.sportsbookreview.com/betting-odds/ncaa-basketball/?date=20120102').read()
    findTeamName= re.compile('"team-name".*>(.*)
    find_it = re.findall(findTeamName,webpage)
    print find_it
    [' ']


    I've checked tutorials and the pages they scrape work fine when I do them. So what's up with this?

    I've never used urllib before. I downloaded Beautiful Soup 4 from the get-go. If you change your code to use that (since I have no idea how urlopen, compile, or findall work), I'll be glad to take a look.

    Also, try attaching your code in a .txt file.

  9. #9
    HUY
    Join Date: 04-29-09
    Posts: 253
    Betpoints: 3257

    Quote Originally Posted by illfuuptn
    Idk why it is returning this: [' '] but it is and it makes no sense.


    webpage=urlopen('http://www.sportsbookreview.com/betting-odds/ncaa-basketball/?date=20120102').read()
    findTeamName= re.compile('"team-name".*>(.*)
    find_it = re.findall(findTeamName,webpage)
    print find_it
    [' ']


    I've checked tutorials and the pages they scrape work fine when I do them. So what's up with this?

    Never parse HTML using hand-crafted regular expressions. You will go insane and the result will always be unreliable. Just use BeautifulSoup and save yourself the hassle.
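
    As a rough illustration of the parser approach, here is a sketch that uses the standard library's html.parser as a stand-in for BeautifulSoup, so it runs without installing anything. The markup and the TeamNameParser class are hypothetical; the real SBR page structure is not shown in this thread. With BeautifulSoup itself, the equivalent lookup would be something along the lines of soup.find_all(class_="team-name").

```python
from html.parser import HTMLParser


class TeamNameParser(HTMLParser):
    """Collect the text inside elements whose class includes 'team-name'.

    Hypothetical helper for illustration; assumes well-formed markup
    with no void tags (such as <br>) inside the team-name element.
    """

    def __init__(self):
        super().__init__()
        self.names = []
        self._depth = 0   # > 0 while inside a team-name element
        self._buf = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1          # nested tag inside the element
        elif "team-name" in classes:
            self._depth = 1           # entering a team-name element
            self._buf = []

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
            if self._depth == 0:      # leaving the team-name element
                self.names.append("".join(self._buf).strip())

    def handle_data(self, data):
        if self._depth:
            self._buf.append(data)


# Made-up sample markup; NOT the real SBR page.
html = """
<div class="row"><span class="team-name"><a href="#">Duke</a></span></div>
<div class="row"><span class="team-name"><a href="#">Kansas</a></span></div>
"""

parser = TeamNameParser()
parser.feed(html)
print(parser.names)  # ['Duke', 'Kansas']
```

    Unlike the regex, this does not care whether the page is on one line or many, or what attributes come before or after the class.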

  10. #10
    KennyRogers
    Join Date: 12-20-12
    Posts: 1
    Betpoints: 12

    Is there an update to this? I'm not familiar with BeautifulSoup; I am more familiar with Scrapy.

  11. #11
    Spektre
    Join Date: 02-28-10
    Posts: 184
    Betpoints: 1250

    Is scraping SBR allowed?
