Does anyone have a good source for this? Something that scrapes nicely into Excel.
Printable MLB historical box scores
therber2Restricted User
- 12-22-08
- 3715
#1
FlightRestricted User
- 01-28-09
- 1979
#2
Probably better to build an automated tool, considering the vast number of games each season.
Iterative HTML crawler + RegEx parser
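For anyone wondering what an "iterative HTML crawler + RegEx parser" might look like, here is a minimal Python sketch. It is not a tested scraper for any particular site: the result pattern ("W 5-3", "L 2-10") and the names `fetch_html`, `parse_results` and `crawl` are assumptions made up for illustration, and the regex would need adjusting to whatever the real pages contain.

```python
import re
import urllib.request
from typing import Callable, Iterable, Iterator, List, Tuple

# Assumed result format such as "W 5-3" or "L 2-10"; adjust to the real page.
RESULT_RE = re.compile(r"\b([WL])\s+(\d+)-(\d+)\b")


def fetch_html(url: str) -> str:
    """Download one page and return its body as text."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_results(html: str) -> List[Tuple[str, int, int]]:
    """Return (outcome, first_score, second_score) for every match in the page."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in RESULT_RE.finditer(html)
    ]


def crawl(
    urls: Iterable[str],
    fetch: Callable[[str], str] = fetch_html,
) -> Iterator[Tuple[str, str, int, int]]:
    """Visit each URL in turn and yield (url, outcome, first_score, second_score)."""
    for url in urls:
        for outcome, first, second in parse_results(fetch(url)):
            yield (url, outcome, first, second)
```

Passing `fetch` as a parameter lets you try the parser on saved HTML before pointing it at a live site. You would supply one URL per team or per date, and the loop does the repetition.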
therber2Restricted User
- 12-22-08
- 3715
#4
Here, Flight,
Check out my file in progress:
Attached Files
FlightRestricted User
- 01-28-09
- 1979
#5
They don't have 2010 data available yet.
FlightRestricted User
- 01-28-09
- 1979
#6
I looked at your spreadsheet.
Were you looking for data for this year? It looks like you're trying to track teams as they progress through the current season (as opposed to analyzing data from past seasons in order to run regressions and build a model).
therber2Restricted User
- 12-22-08
- 3715
#7
I saw that website before. Could you show me an example of how you could scrape all of the data for one team, if possible?
Thanks
FlightRestricted User
- 01-28-09
- 1979
#8
I recommend going fully automated, but that does require programming skill. If you want to do it manually, continue reading. I say go automated because you will have to repeat this exercise every time you want to update the data and get the latest numbers. I hate repeating tasks.
The team "sortable schedule" pages on mlb.com seem to work well for a manual job. Highlight the table with your mouse and press CTRL+C. Go to Excel, right-click, select Paste Special, and paste it as Text. If you are using Internet Explorer, you can probably just hit CTRL+V, but I find other browsers like Firefox and Chrome require a paste-special-text.
This should get you a table in Excel for one team.
You can optionally delete the games that were "Postponed" to make your data clean, but it may screw up your dates. I think you may have other date issues as well: there are sometimes doubleheaders and no-game days, so the rows won't line up with the Excel table you attached, where you have every day listed. I would recommend leaving the date out of your table and just using "Game 1, Game 2, etc." for your column headers.
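The "drop the postponed rows and number the games instead of dating them" idea can be sketched in Python as follows. This is only an illustration: it assumes each pasted row has been reduced to a (date, result) pair and that postponed games show the word "Postponed" in the result cell; `number_games` is a made-up helper name.

```python
from typing import List, Tuple


def number_games(rows: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Drop postponed games and label the rest "Game 1", "Game 2", ...

    Each input row is (date, result). The date is discarded on purpose so
    that doubleheaders and off days cannot misalign the table.
    """
    played = [
        result
        for _date, result in rows
        if result.strip().lower() != "postponed"
    ]
    return [(f"Game {n}", result) for n, result in enumerate(played, start=1)]
```

Because the label comes from the position in the list rather than the calendar, two games on the same day simply become two consecutive game numbers.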
Now... we need to clean up the scores and get columns of RS and RA, right? Insert a column to the right of the Result column (column C) and paste this formula into it:
Code:
=RIGHT(C2,LEN(C2)-2)
RS formula
Code:
=LEFT(D2,FIND("-",D2)-1)
RA formula
Code:
=RIGHT(D2,LEN(D2)-FIND("-",D2))
This should get you one team's runs scored for each game in the format you need. Now repeat for RA's, and for all the other teams. I recommend hiring a high school kid, poor Yugoslavian, or a monkey to finish the job.
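If you would rather not hire the monkey, the same clean-up could be scripted. Below is a rough Python sketch of the split. It assumes the Result cell looks like "W 5-3" (a one-letter outcome, a space, then two scores separated by a dash) and, following the RS formula above, treats the number left of the dash as runs scored. Check that against your pasted data before trusting it. `split_result` and `split_all` are made-up names.

```python
from typing import Dict, List, Tuple


def split_result(result: str) -> Tuple[int, int]:
    """Split a result such as "W 5-3" into (left_score, right_score)."""
    score = result.strip()[2:]          # drop the "W " or "L " prefix
    left, _dash, right = score.partition("-")
    return int(left), int(right)


def split_all(results_by_team: Dict[str, List[str]]) -> Dict[str, List[Tuple[int, int]]]:
    """Apply split_result to every game of every team."""
    return {
        team: [split_result(r) for r in results]
        for team, results in results_by_team.items()
    }
```

Splitting on the dash rather than counting characters means double-digit scores such as "W 10-3" come out correctly.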
Hope this helps.
arwarSBR High Roller
- 07-09-09
- 208
#9
brute force