1. #106
    pedro803
    Join Date: 01-02-10
    Posts: 309
    Betpoints: 5708

    Quote Originally Posted by uva3021
    stick a comment mark, ', before the "On Error Resume Next" line, then post the error message, if any

    it could be merely statfox being offline, or a bad internet connection

    I get:

    Run-time error '9':
    Subscript out of range



    Thanks for the help -- and again thanks for the heads up that Excel can scrape web pages!

  2. #107
    uva3021
    Join Date: 03-01-07
    Posts: 537
    Betpoints: 381

    click debug and tell me what line it highlights

  3. #108
    pedro803
    Join Date: 01-02-10
    Posts: 309
    Betpoints: 5708

    I clicked debug and step into in the VB code window and it highlighted the very first line:

    Sub NFLfromStatfox()

    but I kinda don't think that is what you were looking for, I don't know how to use debug

  4. #109
    uva3021
    Join Date: 03-01-07
    Posts: 537
    Betpoints: 381

    did you define this range?

    Code:
    For i = 1 To Range("NFLteams").Rows.Count

  5. #110
    pedro803
    Join Date: 01-02-10
    Posts: 309
    Betpoints: 5708

    well that line is in the code that you provided, so I guess I did

    in your instructions you wrote:

    Then select all the teams in the table and define a name to the range, a brief survey of the code and one can see I named the range "NFLTeams"

    I wasn't exactly sure what to do, I highlighted the column of teams and pushed the button at the top "define name" on the formulas tab and named it NFLteams

  6. #111
    pedro803
    Join Date: 01-02-10
    Posts: 309
    Betpoints: 5708

    stepping through the code when it gets to the line:

    Sheets(sht).Select

    i get the error message

    Run-time error '1004':

    Application-defined or object-defined error

  7. #112
    uva3021
    Join Date: 03-01-07
    Posts: 537
    Betpoints: 381

    that's because "sht" doesn't exist as a sheet. There is something wrong with your naming conventions

    E.g., this is how my range, "NFLTeams", is structured:

    Arizona2009
    Atlanta2009
    ....
    NY+JETS2004
    ....
    SAN+DIEGO2000

    Every team from 2009 to 2000 is named according to how they are formatted in the statfox link

    Copy the names from a team report page, replace all spaces with a "+", then run the code

  8. #113
    pedro803
    Join Date: 01-02-10
    Posts: 309
    Betpoints: 5708

    I am giving up for the night -- I have tried everything I can think of for now. I did import the table from the destination page with the excel browser (I could only import the whole page, wasn't able to get the table separate) and I have done the find and replace -- and I have done my best to name the range but I am not sure I am doing this right. I get the names of all the sheets e.g. NY+Giants2004 but none of the sheets have anything in them.

    thanks for all of your help, I will come back to this!

  9. #114
    thechaoz
    2019 SBRs Toughest Poster
    Join Date: 10-23-09
    Posts: 12,155
    Betpoints: 35902

    Amazing thread. Thanks for all the great info.

  10. #115
    ScoreProphet
    Join Date: 09-01-10
    Posts: 11

    Hi everyone, new guy here..

    I didn't read this entire thread, but I got the gist of it from the first few pages. I started doing handicapping a few years ago as a hobby. I started by doing calculations on spreadsheets. Since then I've moved on to running all out, full-blown simulations of football games with some Python scripts I wrote. I run each matchup 10,000 times, and it gives me each team's % chance of winning straight up or against a given spread, average scores, rush attempts and yards and pass attempts, completions and yards. I also use the scripts to rank the teams, and I like my rankings better than most of the ones used in the BCS.

    I'm not here to gloat. I can't, actually, because I haven't yet used my results to gamble with. I also don't have hard numbers on exactly how successful my projections are, though I will in the coming days I hope. All I know for sure is that I consistently do very well on ESPN.com's college pick'em, as that was my initial reason for starting all of this. That said, I'm only here to answer any questions anyone has about my methods, my scripts, or whatever else you can think of.

    A little more detail:
    I've built a database of each and every college football play for the last 2 years (and I can go back further just by running a script). For every play, the database has the down & distance, the yardline, the quarter and time left, the current score, the type of play, yards gained or lost, turnover, penalty... the whole shebang. With this information I can build my own boxscores with almost any type of information I need. More importantly, I use the info to build a sort of profile of each team, with their individual offensive, defensive, and special teams strengths and weaknesses.

    These team profiles consist of a series of ratings which, when compared to any given opponent's ratings, can be fed to the simulation script which churns out 10,000 simulated games between the given teams.

    That's the basics... if you have any questions or would like any tips, ask away.

  11. #116
    ScoreProphet
    Join Date: 09-01-10
    Posts: 11

    First post took a while for moderator approval, and then went up twice. Sorry!
    Last edited by ScoreProphet; 09-01-10 at 04:13 PM. Reason: double post

  12. #117
    Indecent
    Join Date: 09-08-09
    Posts: 758
    Betpoints: 1156

    Quote Originally Posted by ScoreProphet
    More importantly, I use the info to build a sort of profile of each team, with their individual offensive, defensive, and special teams strengths and weaknesses.

    These team profiles consist of a series of ratings which, when compared to any given opponent's ratings, can be fed to the simulation script which churns out 10,000 simulated games between the given teams.

    That's the basics... if you have any questions or would like any tips, ask away.
    Sounds interesting. The bolded part reminds me a bit of opponent modeling in poker-botting. If you don't mind me asking, what type of information/stats do you use for the team profiles? If your information is detailed enough, I think you could create a pretty robust play-by-play prediction system.

  13. #118
    ScoreProphet
    Join Date: 09-01-10
    Posts: 11

    Quote Originally Posted by Indecent
    If you don't mind me asking, what type of information/stats do you use for the team profiles?
    Without going into mind-numbing detail, each team is rated on their run "power" and pass "power", and also such things as their punt/kick return ability and whatnot. The rating system itself is on a scale centered around 1.0, which would be an average rating. On offense, anything over 1.0 is good, and on defense less than 1.0 is good. Let's say that across all FBS teams, the average running play is 5 yards per carry. If Team A has a run rating of 1.1, then that team should average about 5 * 1.1 = 5.5 yards per carry against an average (1.0 run defense) team. If instead of an average team, Team A played against a team with a run defense rating of 0.8 (very good), then Team A will probably average about 5 * 1.1 * 0.8 = 4.4 yards per rush.
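    The arithmetic above can be sketched in a few lines of Python. This only illustrates the multiplication described in the post; the function name expected_yards_per_carry and the constant LEAGUE_AVG are made up for the example, and the 5.0 average is the hypothetical figure from the post, not real data.

```python
def expected_yards_per_carry(league_avg, offense_rating, defense_rating=1.0):
    """Scale the league-average gain by both ratings (1.0 means average)."""
    return league_avg * offense_rating * defense_rating


LEAGUE_AVG = 5.0  # hypothetical FBS average yards per carry, taken from the post

# Team A (run rating 1.1) against an average run defense (1.0)
vs_average = expected_yards_per_carry(LEAGUE_AVG, 1.1)
# Team A against a very good run defense (0.8)
vs_good = expected_yards_per_carry(LEAGUE_AVG, 1.1, 0.8)

print(round(vs_average, 2), round(vs_good, 2))  # prints 5.5 4.4
```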

    The simulation script itself will randomize Team A's carries throughout the games in a way that after 10,000 games they will average the "correct" yards per play. It does the same for each pass completion (and similarly completion %), and also accounts for turnovers and penalties, and like I mentioned kick/punt returns. Each team's profile also has info regarding their pass/rush ratio, which is also accounted for in the simulation.
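    A rough sketch of that idea, not the actual script: each carry is drawn at random around the expected gain, and the average over 10,000 games settles close to the target. The normal distribution, the spread of 6.0 yards, and the 35 carries per game are all assumptions chosen for illustration; the post does not say which distribution is used.

```python
import random


def simulate_rushing(expected_ypc, carries_per_game=35, games=10_000, seed=0):
    """Return the average yards per carry across all simulated games."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    total_yards = 0.0
    total_carries = 0
    for _ in range(games):
        for _ in range(carries_per_game):
            # Assumed model: gains are normally distributed around the
            # expected value, with a wide spread so single carries vary a lot.
            total_yards += rng.gauss(expected_ypc, 6.0)
            total_carries += 1
    return total_yards / total_carries


# 4.4 is the expected yards per rush from the example above
average = simulate_rushing(4.4)
print(round(average, 2))
```

    Any single simulated game can land far from 4.4 yards per carry; only the long-run average is pinned to the ratings.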

    Simulated coaching decisions, such as passing more often when you're trailing toward the end of the game (or running more with a large lead), are also taken into account to provide more realistic results.

  14. #119
    CrimsonQueen
    Join Date: 08-12-09
    Posts: 1,068
    Betpoints: 1660

    ScoreProphet: How do you then back test this? I have somewhat of a stats database, and some formulas I've created similar to your rating each thing based around 1.0... I have limited knowledge of Python, but really want to back test my data with my formulas to find the final scores vs. the actual scores and spreads.
    Currently... I made it so I have a drop down box with each team, then it pulls all their stats into the fields for my formulas to read (using an Array formula in Excel)...but it's insanely time consuming (and outright laughable, really) to switch every single matchup and look at every single score and compare them all by hand... then change the formula slightly to make it more accurate and then redo all of this by hand again.........
    Anyone who wants to help, thanks!

  15. #120
    ScoreProphet
    Join Date: 09-01-10
    Posts: 11

    Quote Originally Posted by CrimsonQueen
    ScoreProphet: How do you then back test this? I have somewhat of a stats database, and some formulas I've created similar to your rating each thing based around 1.0... I have limited knowledge of Python, but really want to back test my data with my formulas to find the final scores vs. the actual scores and spreads.
    Currently... I made it so I have a drop down box with each team, then it pulls all their stats into the fields for my formulas to read (using an Array formula in Excel)...but it's insanely time consuming (and outright laughable, really) to switch every single matchup and look at every single score and compare them all by hand... then change the formula slightly to make it more accurate and then redo all of this by hand again.........
    Anyone who wants to help, thanks!
    Yeah, as long as you're dealing with spreadsheets, you will be doing a lot of things by hand. My knowledge of Python and programming in general is pretty limited, as well, but I know enough to get by for my needs. You could use Python to work with CSV files, but storing all of your data in a database is much neater (SQLite works great). If you're going to get more into Python, look into SQLAlchemy... it makes working with databases with Python pretty easy. It's one more thing to learn, but it beats having to write SQL queries all the time, and then figuring out how to deal with the lists, and then this and that. It handles a lot of the legwork and confusing bits for you.
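    For a sense of what the plain-SQL route looks like before reaching for SQLAlchemy, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the team names, and the scores are invented for illustration, and points_scored is a hypothetical helper, not part of any script mentioned in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE games ("
    "home TEXT, away TEXT, home_score INTEGER, away_score INTEGER)"
)

# Made-up results, one row per game
rows = [
    ("Team A", "Team B", 24, 17),
    ("Team C", "Team A", 10, 31),
]
conn.executemany("INSERT INTO games VALUES (?, ?, ?, ?)", rows)
conn.commit()


def points_scored(conn, team):
    """Total points a team scored, counting both home and away games."""
    cur = conn.execute(
        "SELECT COALESCE(SUM(CASE WHEN home = ? THEN home_score "
        "ELSE away_score END), 0) "
        "FROM games WHERE home = ? OR away = ?",
        (team, team, team),
    )
    return cur.fetchone()[0]


team_a_points = points_scored(conn, "Team A")
print(team_a_points)  # prints 55
```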

    As for your situation, I haven't dealt with spreadsheets in a while, so it's hard for me to say how you should backtest your results, especially without knowing the details of how you have all your data laid out. It sounds to me, though, that instead of using the dropdown lists, you should find a way to import the week's results onto one sheet. Just a long list, with each row containing one game. I would imagine columns A&B having the home and away team names, C&D your predicted scores, and E&F the actual scores. This way it's easy to put formulas in G&H for the difference between your projections and the actual scores, or whatever other calculations you want to see.
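    The same one-row-per-game layout also works outside a spreadsheet. Below is a sketch, assuming the sheet has been exported to CSV with the six columns described above; the column names, the sample rows, and the backtest function are all hypothetical. It reports the per-team score errors plus the mean absolute error of the predicted margin, which is one reasonable summary among several.

```python
import csv
import io

# Made-up rows: teams, predicted scores, then actual scores
SAMPLE = """home,away,pred_home,pred_away,act_home,act_away
Team A,Team B,27,20,24,17
Team C,Team D,14,21,10,31
"""


def backtest(csv_text):
    """Return per-game errors and the mean absolute error of the margin."""
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        pred_home, pred_away = float(row["pred_home"]), float(row["pred_away"])
        act_home, act_away = float(row["act_home"]), float(row["act_away"])
        results.append({
            "game": f'{row["away"]} at {row["home"]}',
            "home_error": pred_home - act_home,
            "away_error": pred_away - act_away,
            # Margin is home minus away, the number that matters vs. a spread
            "margin_error": (pred_home - pred_away) - (act_home - act_away),
        })
    mae = sum(abs(r["margin_error"]) for r in results) / len(results)
    return results, mae


results, mae = backtest(SAMPLE)
print(mae)  # prints 7.0
```

    Once the comparison is a function call, changing a formula and re-running every matchup takes seconds instead of an evening of clicking through dropdowns.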

  16. #121
    craigpb
    SBR PRO
    Join Date: 06-19-08
    Posts: 657
    Betpoints: 7298

    Thanks for all the great info guys; really helpful.

  17. #122
    hubie69
    I am JJs bookie
    Join Date: 09-16-10
    Posts: 7,329
    Betpoints: 617

    I use a MySQL db with a series of bash scripts on a Linux box for my college basketball stuff. Was a F*ck ton of work at the beginning to get it going but now that it's running it doesn't require much from me. Only does college basketball though.

  18. #123
    nmr123321
    Join Date: 01-06-10
    Posts: 609
    Betpoints: 25

    thank you very much for this

  19. #124
    dmolition
    Join Date: 10-10-08
    Posts: 106
    Betpoints: 228

    This thread is really great, thanks a lot. I have some questions about using
    br.set_handle_robots(False) in mechanize
    when a site has a robots.txt file. I know there are legal or ethical issues around this.
    I want to try scraping, but from what I read you need to set timeouts in your scripts so your IP doesn't get banned, and take other measures.

    Are there any sites that are "ok" with being scraped for stats (sbr?)? Or should you be really careful with your scraping since most, I would guess, don't like it? What other things should we consider?
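    As an alternative to switching robots.txt handling off, a script can read the file and check each URL first. The sketch below uses urllib.robotparser from the Python standard library rather than mechanize; the robots.txt content, the example.com URLs, and the my-stats-bot agent name are made up. Whether a given site tolerates scraping is still a question for its terms of use, which this check does not answer.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content. A real script would point the parser at the
# site's own file with parser.set_url(...) followed by parser.read().
ROBOTS_TXT = """User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, agent="my-stats-bot"):
    """True if the parsed rules let this (hypothetical) agent fetch the URL."""
    return parser.can_fetch(agent, url)


ok = allowed("https://example.com/boxscores/2010")
blocked = allowed("https://example.com/private/data")
delay = parser.crawl_delay("my-stats-bot")  # seconds requested by the site

print(ok, blocked, delay)
```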

  20. #125
    Wrecktangle
    Join Date: 03-01-09
    Posts: 1,524
    Betpoints: 3209

    dmolition, most sites are NOT OK with scraping due to copyright and not a few will actively block you. And it seems that even those who tolerate it change formats so often that you are always tweaking code to get around the changes.

  21. #126
    Maverick22
    Join Date: 04-10-10
    Posts: 807
    Betpoints: 58

    Which Sites are you referring to that will block you?

  22. #127
    lucaario83
    Join Date: 10-05-10
    Posts: 180

    very interesting stuff

  23. #128
    dmolition
    Join Date: 10-10-08
    Posts: 106
    Betpoints: 228

    Quote Originally Posted by Wrecktangle
    dmolition, most sites are NOT OK with scraping due to copyright and not a few will actively block you. And it seems that even those who tolerate it change formats so often that you are always tweaking code to get around the changes.
    Yeah, I figured as much. So to scrape 10 seasons of any sport, I imagine I need multiple IPs, timeouts in scripts, constant checking for changes in the DOM structure of the HTML, etc. Now I know the cost of data.

    I'm gonna research, and maybe if I gather enough data I'll be willing to trade it (after I validate it, of course).
    It would be nice to have a list of sites that enforce anti-scraping policies more strictly, or where NOT to try it, so we can have a little peace of mind.

    Also, I'm taking the hard road and learning R and Python (checking out SciPy also) for data analysis. I'm savvy with software development. Once I can actually start doing some serious data analysis, if anyone wants to exchange technical tips, maybe we can open a "hacking/data analysis stuff" thread to ask general questions, share tips, and contribute in general.

  24. #129
    uva3021
    Join Date: 03-01-07
    Posts: 537
    Betpoints: 381

    i abuse statfox and have yet to be banned

  25. #130
    Data
    Join Date: 11-27-07
    Posts: 2,236

    Quote Originally Posted by Wrecktangle
    dmolition, most sites are NOT OK with scraping due to copyright and not a few will actively block you.
    Hm, most sites? No way. The largest projects traffic-wise are collecting several years' worth of box scores and play-by-plays. Everything else is peanuts. The only site that ever temporarily blocked my scraping was Yahoo!.

  26. #131
    dmolition
    Join Date: 10-10-08
    Posts: 106
    Betpoints: 228

    OK, I'm starting to collect the data. So far so good, but the next step is to check the integrity.
    I'm comparing my data against covers.com and espn.com mostly.
    Are these sites accurate with stat records?

    What sites are more reliable in your opinion for stat comparing?

  27. #132
    jscar3
    Join Date: 02-10-09
    Posts: 130

    i will look this up to see the sense in it. thanks.

  28. #133
    LegitBet
    steelers
    Join Date: 05-25-10
    Posts: 538

    what would be nice is 'data for dummies', but that comes with many challenges for the sharpies...
    my 2 cents

  29. #134
    Jeremy Nguyen
    Join Date: 10-25-10
    Posts: 1

    last Monday night 10/18

    Hello to everyone,
    Does anybody remember what time the ball kicked off for the 2nd between Tenn and Jacksonville? Please! Thank you

  30. #135
    Chachieguy
    Join Date: 10-27-10
    Posts: 3
    Betpoints: 11

    Looking forward to learning more. Thank you

  31. #136
    Flying Dutchman
    Floggings continue until morale improves
    Join Date: 05-17-09
    Posts: 2,467
    Betpoints: 759

    Quote Originally Posted by Data
    Hm, most sites? No way. The largest projects traffic-wise are collecting several years' worth of box scores and play-by-plays. Everything else is peanuts. The only site that ever temporarily blocked my scraping was Yahoo!.
    I'm hearing both sides on this. I had trouble with Covers last year in the NBA and then went to a piece of software where I could change my IP and problem went away. Then I quit changing IP for a while and didn't get blocked. Or were they just having site trouble?

    I also had trouble on FoxSports and CBS as I recall.

  32. #137
    Data
    Join Date: 11-27-07
    Posts: 2,236

    While scraping boxscores, I make a courtesy 1 sec pause after processing each boxscore. Not sure if everybody does this but they should.

  33. #138
    Indecent
    Join Date: 09-08-09
    Posts: 758
    Betpoints: 1156

    Quote Originally Posted by Data
    While scraping boxscores, I make a courtesy 1 sec pause after processing each boxscore. Not sure if everybody does this but they should.
    If you don't mind me asking, how long have you been using this method?

    I have my scraper pause for a random number of seconds (usually 10-25 but it will go shorter and longer) to try to simulate a human browsing the pages. If you've been using 1 second delay successfully for a while with no bans, etc, I might have to drop my delay times considerably.
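    A randomized pause like that can be sketched as below. This is an illustration under assumptions, not anyone's actual scraper: polite_fetch_all and its parameters are hypothetical names, the delay is drawn uniformly from a fixed 10 to 25 second window (it does not reproduce the occasional shorter or longer pauses mentioned above), and the demo swaps in a fake fetch and a recorded sleep so nothing is downloaded and nothing actually waits.

```python
import random
import time


def polite_fetch_all(urls, fetch, min_delay=10.0, max_delay=25.0,
                     sleep=time.sleep, rng=random):
    """Call fetch(url) for each URL, pausing a random time between requests."""
    pages = []
    for index, url in enumerate(urls):
        if index > 0:
            # Random pause so requests are not evenly spaced
            sleep(rng.uniform(min_delay, max_delay))
        pages.append(fetch(url))
    return pages


# Demo: a stand-in fetch function, and pauses recorded instead of slept
pauses = []
pages = polite_fetch_all(
    ["page1", "page2", "page3"],
    fetch=lambda url: f"<html>{url}</html>",
    sleep=pauses.append,
)
print(len(pages), len(pauses))  # prints 3 2
```

    Passing min_delay=1.0 and max_delay=1.0 gives the fixed one-second courtesy pause described earlier in the thread.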

  34. #139
    Data
    Join Date: 11-27-07
    Posts: 2,236

    I finished my last big scraping project about a year ago. I only scrape boxscores nowadays as I calculate all the stats I need myself. Well, I do import some stuff into Excel too but not that much.

  35. #140
    pro-style
    Join Date: 07-20-10
    Posts: 177
    Betpoints: 259

    where is the best place to scrape boxscores?
