Excel based tutorial for web scraping
Collapse
X
-
oilcountry99SBR Wise Guy
- 08-29-10
- 707
#176Comment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#177This is my current copy selection
Worksheets("Test").Range("A2299").Copy _
this is a way i found to get to a cell and clear it.
Dim cell As Range
For Each cell In [a:a]
If cell.Value = "SPORTS" Then cell.ClearContents 'put any value you want here
Next cell
What i would like to do is use them both together, to either delete everything after cell.value or use cell.value in my copy instead of p99
have having a hard time at combining everythingComment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#178Instead of copying I just deleted un wanted stuff and formatted on the main page. At the end of my code it renames the sheet.
So I guess you could say I figured it out.Comment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#179Text Splitting...
Pittsburgh Pirates at Chicago Cubs
Milwaukee Brewers at Philadelphia Phillies
New York Mets at Atlanta Braves
That's a1 trough a3
I would like to have the home teams in r1 through r3 and visiting teams in s1 through s3.
Matchups will not always be in the same order.
It seems that for some teams what I tried works, but with others it doesn't.Comment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#180New York Mets at Atlanta Braves =RIGHT(A59,LEN(A59)-(1)-FIND("at",A59))
that works to get Atlanta
Pittsburgh Pirates at Chicago Cubs =RIGHT(A64,LEN(A64)-(1)-FIND("at",A64))
same formula doesn't work to get ChicagoResult= es at Chicago Cubs Comment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#181Got it =TRIM(RIGHT(SUBSTITUTE(A64," at",REPT(" ",LEN(A64))),LEN(A64)))Comment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#182On the current sheet that I'm using to import my data from Web. If I type in a formula and fill down it keeps all the same result, but the actual formula updates by row like it should. I need to double click the cell then hit enter for the correct results.
Is this how excel is supposed to act? I've never imported data with Excel before but am I supposed to close the connection after? Is this a setting issue?Comment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#183Sub indirect()
Dim r As Range
Dim i As Long
Dim cell As Range
Set r = Range("A1:A300")
For i = r.Rows.Count To 1 Step -1
With r.Cells(i, 1)
If .Value = "STARTERS" Then
Range("A" & i - 2).Range("$r$1").FormulaR1C1 = _
"=TRIM(RIGHT(SUBSTITUTE(RC[-17],"" at"",REPT("" "",LEN(RC[-17]))),LEN(RC[-17])))"
End If
End With
Next i
End Sub
This will search a1 through a300 and look for they key word "starters". Once it finds starters it will move up two rows and split the matchup and place the home team in column r
table i imported is http://espn.go.com/mlb/probablesComment -
HeeeHAWWWWSBR Hall of Famer
- 06-13-08
- 5487
#184Good thread this - wish it was available a few years ago, would have saved me an enormous amount of time :-)Comment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
-
HeeeHAWWWWSBR Hall of Famer
- 06-13-08
- 5487
#186
Haven't done much of this lately (ended up paying someone to write a scraper!), but if I'm reading you correctly that sounds like a job for a vlookup.Comment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#187
I was able to do it with INDEX. Looks like there may be 2 ways!
=IFERROR(INDEX(RPI!$C$3:$C$32,MATCH($R4, RPI!$B$3:$B$32,0)),"")Comment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#188I'm looking to import pitcher stats that include WHIP
I'm currently using http://espn.go.com/mlb/probables
But there is no walks, so I cannot use a formula to figure out WHIP stats.
Any suggestions of a better site than espn?
Or a way to figure out whip?Comment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#189a4u2fear
im looking to import 5 different web pages with a "loop" doing the exact same formatting, then renaming the tab to a specific name.
as of now, after the standard pull in code i have the following.
'Deletes Test SheetApplication.DisplayAlerts = False
Sheets("Test").Select
ActiveWindow.SelectedSheets.Delete
'Add Titles
Sheets("MR").Select
Range("H9").Select
ActiveCell.FormulaR1C1 = "Us Odds"
Range("I9").Select
ActiveCell.FormulaR1C1 = "Decimal Odds"
Range("I10").Select
Columns("I:I").EntireColumn.AutoFit
Range("J9").Select
ActiveCell.FormulaR1C1 = "Win/Loss"
Range("J10").Select
Columns("J:J").EntireColumn.AutoFit
'autofits all rows
Sheets("MR").Select
Columns("A:I").Select
Columns("A:I").EntireColumn.AutoFit
Range("A3").Select
I would like to have something along the line of
for i = 1 to 5
1= site 1
2= site 2
With Sheets("NHLData").QueryTables.Add(Connec tion:= _
"URL;(1), Destination:= _
Range("$A$1"))
'Deletes Test SheetApplication.DisplayAlerts = False
Sheets("Test").Select
ActiveWindow.SelectedSheets.Delete
'Add Titles
Sheets("MR").Select
Range("H9").Select
ActiveCell.FormulaR1C1 = "Us Odds"
Range("I9").Select
ActiveCell.FormulaR1C1 = "Decimal Odds"
Range("I10").Select
Columns("I:I").EntireColumn.AutoFit
Range("J9").Select
ActiveCell.FormulaR1C1 = "Win/Loss"
Range("J10").Select
Columns("J:J").EntireColumn.AutoFit
'autofits all rows
Sheets("MR").Select
Columns("A:I").Select
Columns("A:I").EntireColumn.AutoFit
Range("A3").Select
next i
I know im way off, i just need a direction to go in. Do not do the work for me lolComment -
oilcountry99SBR Wise Guy
- 08-29-10
- 707
#190I'm looking to format the following:
Cell A1 contains: Jake Peavy (R)
I want cell B1 to contain: J. Peavy
I know this calls for right left replace trim etc etc commands but I don't get how that works for the life of me.
Any help here would be great
TIA
OC99Comment -
oilcountry99SBR Wise Guy
- 08-29-10
- 707
#191I'm looking to import pitcher stats that include WHIP
I'm currently using http://espn.go.com/mlb/probables
But there is no walks, so I cannot use a formula to figure out WHIP stats.
Any suggestions of a better site than espn?
Or a way to figure out whip?Comment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#192Oil,
give this a try
=LEFT(A1,1) & ". " & B1 & MID(A1,SEARCH(" ", A1,1)+1,(SEARCH(" ",A1,SEARCH(" ",A1,1)+1)-1)-SEARCH(" ",A1,1))Comment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#193To be a bit more clear on what i want to do is to get data from 5 different pages (all with different names) with a loop to avoid having 5 different set of info. Just a bit more simpler code... and alternate method.
As of now I have two modules, module 2 has -one sub that gets site 1, I copied that down and have a another sub to get site 2 and so on..
Then in module 1 i have my main code, which is the one I run.
sub main
call site 1
call site 2
call site 3
...
end subComment -
akphideltSBR MVP
- 07-24-11
- 1228
#194Awesome thread guys. Hope I can help. Expert in excel. Also produced web programs, was on another site with an NBA tracker. Have it on www.akptracker.com. Can scrape any data off any site.
B_rad, I promise you if you took the time to do this through php and MySQL you would never use excel again. And it's completely free to do. Just download Xampp from apachefriends.org. Either way, hope I can help. Love this stuff.Comment -
oilcountry99SBR Wise Guy
- 08-29-10
- 707
#195
I appreciate your efforts and any other suggestions you may haveComment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#196I had an error! copy and paste this in b1.
=LEFT(A1,1) & ". " & MID(A1,SEARCH(" ", A1,1)+1,(SEARCH(" ",A1,SEARCH(" ",A1,1)+1)-1)-SEARCH(" ",A1,1))Comment -
oilcountry99SBR Wise Guy
- 08-29-10
- 707
#197Works great! My next question is what if the name is Jorge De La Rosa (L)?
I end up with J. DeComment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#198You will need to play around with that one. Do a Google search for excel text splitting from space to bracket.
I was able to overcome my problem by using
Dim URL As Range
For Each URL In Sheets("Urls").Range("A1").Cells 'Sheet Urls is where I have the different urls to add on to the site.
ActiveWorkbook.Worksheets.Add
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;www.firstpartofsite.com/" & URL, Destination:=Range("$A$1"))
.Name = "table"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlOverwriteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = False
.RefreshPeriod = 0
.WebSelectionType = "1"
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = True
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
Next
End SubComment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#199a12 = Jorge De La Rosa (L)
=LEFT(A12,1) & ". " & MID(A12,SEARCH(" ", A12,1)+1,(SEARCH(" (",A12,SEARCH(" ",A12,1)+1)-1)-SEARCH(" ",A12,1))
b12 = J. De La RosaComment -
oilcountry99SBR Wise Guy
- 08-29-10
- 707
-
akphideltSBR MVP
- 07-24-11
- 1228
#201I'm looking to import pitcher stats that include WHIP
I'm currently using http://espn.go.com/mlb/probables
But there is no walks, so I cannot use a formula to figure out WHIP stats.
Any suggestions of a better site than espn?
Or a way to figure out whip?
ESPN is the place for MLB stats! Discover the All MLB Pitching stat leaders of the 2025 MLB Regular Season.
You can loop through the "count" and store all the pitcher data in a sheet, then just look up that data from the Probables page you have by pitcher?
You'd just have to adjust the pitcher count and the qualified/false or qualified/true parameters and loop through that beast and you should have all the pitching data you can handle.Comment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#202What about getting the statistics for the pitchers from ESPN stats pages... like this...
ESPN is the place for MLB stats! Discover the All MLB Pitching stat leaders of the 2025 MLB Regular Season.
You can loop through the "count" and store all the pitcher data in a sheet, then just look up that data from the Probables page you have by pitcher?
You'd just have to adjust the pitcher count and the qualified/false or qualified/true parameters and loop through that beast and you should have all the pitching data you can handle.
On the probables page it all there and easy to get. The pitching link you posted, its over many of pages. I'm not advanced enough to take on this task.
I would love to make a sheet that has every pitcher in mlb and all thier stats.Comment -
akphideltSBR MVP
- 07-24-11
- 1228
#203I'll take a stab at it after lunch... might be a little rusty on VBA but it should come back.Comment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#204Rusty or not. Point us in the direction and we should be able to help out!Comment -
akphideltSBR MVP
- 07-24-11
- 1228
#205
Code:Sub GetPitchingStats() Dim startPitcher As Long Dim lastPitcher As Long Dim qualified As String Dim lastRow As Long startPitcher = 1 lastPitcher = 800 qualified = "false" lastRow = 1 Application.ScreenUpdating = False Do Until startPitcher > lastPitcher With ActiveSheet.QueryTables.Add(Connection:= _ "URL;http://espn.go.com/mlb/stats/pitching/_/count/" & startPitcher & "/qualified/" & qualified & "/order/false" _ , Destination:=Range("$A$" & lastRow)) .Name = "false" .FieldNames = True .RowNumbers = False .FillAdjacentFormulas = False .PreserveFormatting = True .RefreshOnFileOpen = False .BackgroundQuery = True .RefreshStyle = xlInsertDeleteCells .SavePassword = False .SaveData = True .AdjustColumnWidth = True .RefreshPeriod = 0 .WebSelectionType = xlEntirePage .WebFormatting = xlWebFormattingNone .WebPreFormattedTextToColumns = True .WebConsecutiveDelimitersAsOne = True .WebSingleBlockTextImport = False .WebDisableDateRecognition = False .WebDisableRedirections = False .Refresh BackgroundQuery:=False End With 'delete all blank cells in column R Columns("R:R").Select Selection.SpecialCells(xlCellTypeBlanks).Select Selection.EntireRow.Delete 'start back at cell B1 Range("B1").Select 'add one to the last row so it starts on a blank row next time lastRow = Selection.End(xlDown).Row + 1 'increase the pitcher count by 40 startPitcher = startPitcher + 40 Loop 'a little cleanup lastRow = Range("$B$5000").End(xlUp).Row For i = 2 To lastRow If Cells(i, 2) = "PLAYER" Then Rows(i & ":" & i).EntireRow.Delete End If Next i Application.ScreenUpdating = True End Sub
Comment -
b_rad_1983SBR High Roller
- 01-07-13
- 127
#206This is super ghetto but it gets the job done on my end...
Code:Sub GetPitchingStats() Dim startPitcher As Long Dim lastPitcher As Long Dim qualified As String Dim lastRow As Long startPitcher = 1 lastPitcher = 800 qualified = "false" lastRow = 1 Application.ScreenUpdating = False Do Until startPitcher > lastPitcher With ActiveSheet.QueryTables.Add(Connection:= _ "URL;http://espn.go.com/mlb/stats/pitching/_/count/" & startPitcher & "/qualified/" & qualified & "/order/false" _ , Destination:=Range("$A$" & lastRow)) .Name = "false" .FieldNames = True .RowNumbers = False .FillAdjacentFormulas = False .PreserveFormatting = True .RefreshOnFileOpen = False .BackgroundQuery = True .RefreshStyle = xlInsertDeleteCells .SavePassword = False .SaveData = True .AdjustColumnWidth = True .RefreshPeriod = 0 .WebSelectionType = xlEntirePage .WebFormatting = xlWebFormattingNone .WebPreFormattedTextToColumns = True .WebConsecutiveDelimitersAsOne = True .WebSingleBlockTextImport = False .WebDisableDateRecognition = False .WebDisableRedirections = False .Refresh BackgroundQuery:=False End With 'delete all blank cells in column R Columns("R:R").Select Selection.SpecialCells(xlCellTypeBlanks).Select Selection.EntireRow.Delete 'start back at cell B1 Range("B1").Select 'add one to the last row so it starts on a blank row next time lastRow = Selection.End(xlDown).Row + 1 'increase the pitcher count by 40 startPitcher = startPitcher + 40 Loop 'a little cleanup lastRow = Range("$B$5000").End(xlUp).Row For i = 2 To lastRow If Cells(i, 2) = "PLAYER" Then Rows(i & ":" & i).EntireRow.Delete End If Next i Application.ScreenUpdating = True End Sub
I had to add Dim i As Integer, then it worked great!
Thanks!Comment -
akphideltSBR MVP
- 07-24-11
- 1228
-
b_rad_1983SBR High Roller
- 01-07-13
- 127
#208In the time it took you to do that, how long would it take to accomplish it in php?Comment -
lamichaeljamesSBR Rookie
- 06-02-14
- 40
#209Good thread!Comment -
akphideltSBR MVP
- 07-24-11
- 1228
#210
The beauty of it though, you could scrape every games data, store it and calculate that data yourself if you wanted to.Comment
SBR Contests
Collapse
Top-Rated US Sportsbooks
Collapse
#1 BetMGM
4.8/5 BetMGM Bonus Code
#2 FanDuel
4.8/5 FanDuel Promo Code
#3 Caesars
4.8/5 Caesars Promo Code
#4 DraftKings
4.7/5 DraftKings Promo Code
#5 Fanatics
#6 bet365
4.7/5 bet365 Bonus Code
#7 Hard Rock
4.1/5 Hard Rock Bet Promo Code
#8 BetRivers
4.1/5 BetRivers Bonus Code