Login Search

Excel based tutorial for web scraping

Last Post
#106

Default

Yes I think so b_rad, I can pull the data for the NFL season, box scores and line & total for each game, but I have only tried it on 20+ games so far while I work out all the small problems like OT and pick-em games.

I will post the code later if you want to run it yourself and have a look.

I had a similar problem as you with the data beginning further down the sheet because I had put some formulas to the side of the data and the code will only run from the first unused line. I eventually solved it after several attempts at deleting all the data on sheet1 and retyping the headers.
#107

Default

I'm currently working on the NHL example from the beginning posts. I plan on adding a few things
-Message box
-Headers
-Progress bar

Message box will ask for the year and update the code,
The headers are automatically placed in "sheet1" followed by the data
Progress bar will give a time on how long it will be <- this is tough
#108

Default

Heres what i think is a default macro for the status bar
I'm thinking of using gamenumber logic to figure out the percentage

kinda stumped... I will keep working on it.



Sub
ShowProgressInLoop()
' Show the progress in a loop

Dim rngCel As Range
Dim lCounter AsLong
Dim lTotal AsLong
lTotal = Selection.Cells.Count ' total amount of cells to walk through

' Makes sure that the statusbar is visible.
Application.DisplayStatusBar = True

ForEach rngCel In Selection
' show the progress in the statusbar:
Application.StatusBar = "Processing cell " & rngCel.AddressLocal & _
" " & Format((lCounter / lTotal), "0%")

' do something
' -* Add your code here *-

' update the counter
lCounter = lCounter + 1
Next rngCel


' When your code is finished, reset the statusbar:
Application.StatusBar = False
EndSub
#115

Default

Quote Originally Posted by b_rad_1983 View Post
that error happens because they cannot find the games. I think because you're trying to search 13, 13 isn't done so its not going to search for the full 1250.
I found the same type of error with 11, it happens twice.
yeah I assumed thats why, but I get the same error on 11,12,13. I'll work on it again in a few hours. Thanks for the replies!
#116

Default

I have a question for you guys....first of all I apologize for putting this here but I figure you can help. I'm looking for a way to find tomorrows date in column A and delete that row and the 100 rows below it in excel 2010 vba.

The Date example I am looking for is this 3/28/2014 10:10:00 PM (the 10:10:00 PM is here is the game time which is irrelevant to my search) I want to find just the date portion

I would like to use something like today + 1 or something along those lines to specify the date.

I'm trying to pull odds and I get today's and tomorrow's when I do the web query and I'd like to delete tomorrows entrys.

Awesome thread here guys!!!
#117

Default

shovde I ran this code and it pulls ten games. I will run all the games next.

Public Sub MainCode()
'This is a comment. This sub will have all of our
'main code and will call other subs


'Below are my variables
Dim totalgames As Integer 'i.e. 1230
Dim NHLyear As Integer 'i.e. 12, two digit because the year in NHL's website is only two digits
Dim gamenumber As Integer


Application.ScreenUpdating = False


For NHLyear = 13 To 13


If NHLyear = 12 Then
totalgames = 720 'shortened season!
Else
totalgames = 1230 '2011 and 2013 both were full seasons
End If


For gamenumber = 1 To 10
Call ImportWebpage(gamenumber, NHLyear)
Call PullData
Next


Next


Application.ScreenUpdating = True




End Sub




Public Sub ImportWebpage(game, gameyear)
'this sub only imports the webpage
Dim gamestring As String


Application.DisplayAlerts = False
Sheets("Sheet2").Delete
Application.DisplayAlerts = True
Sheets.Add After:=Sheets(Sheets.Count)
Sheets(Sheets.Count).Name = "Sheet2"


If game < 10 Then
gamestring = "000" & game
ElseIf game < 100 Then
gamestring = "00" & game
ElseIf game < 1000 Then
gamestring = "0" & game
Else
gamestring = game
End If


With Sheets("Sheet2").QueryTables.Add(Connection:= _
"URL;http://www.nhl.com/ice/boxscore.htm?id=20" & gameyear & "02" & gamestring, Destination:= _
Range("$A$1"))
.Name = ""
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With


End Sub


Public Sub PullData()
'this sub will pull data from the imported webpage and
'put into a main sheet which will contain all game data
Dim lastrowsheetone As Integer
Dim findmatch As Integer


lastrowsheetone = Sheets("Sheet1").UsedRange.Rows.Count
findmatch = Application.Match("1st", Sheets("Sheet2").Range("B1:B100"), 0)


'away team stats
Sheets("Sheet1").Cells(lastrowsheetone + 1, 2) = Sheets("Sheet2").Cells(findmatch + 1, 1)
Sheets("Sheet1").Cells(lastrowsheetone + 1, 3) = Sheets("Sheet2").Cells(findmatch + 1, 2)
Sheets("Sheet1").Cells(lastrowsheetone + 1, 4) = Sheets("Sheet2").Cells(findmatch + 1, 3)
Sheets("Sheet1").Cells(lastrowsheetone + 1, 5) = Sheets("Sheet2").Cells(findmatch + 1, 4)
If Sheets("Sheet2").Cells(findmatch + 1, 5) = "" Then
Sheets("Sheet1").Cells(lastrowsheetone + 1, 6) = 0
Else
Sheets("Sheet1").Cells(lastrowsheetone + 1, 6) = Sheets("Sheet2").Cells(findmatch + 1, 5)
End If
Sheets("Sheet1").Cells(lastrowsheetone + 1, 7) = Sheets("Sheet2").Cells(findmatch + 1, 7)


'home team stats
Sheets("Sheet1").Cells(lastrowsheetone + 1, 8) = Sheets("Sheet2").Cells(findmatch + 2, 1)
Sheets("Sheet1").Cells(lastrowsheetone + 1, 9) = Sheets("Sheet2").Cells(findmatch + 2, 2)
Sheets("Sheet1").Cells(lastrowsheetone + 1, 10) = Sheets("Sheet2").Cells(findmatch + 2, 3)
Sheets("Sheet1").Cells(lastrowsheetone + 1, 11) = Sheets("Sheet2").Cells(findmatch + 2, 4)
If Sheets("Sheet2").Cells(findmatch + 2, 5) = "" Then
Sheets("Sheet1").Cells(lastrowsheetone + 1, 12) = 0
Else
Sheets("Sheet1").Cells(lastrowsheetone + 1, 12) = Sheets("Sheet2").Cells(findmatch + 2, 5)
End If
Sheets("Sheet1").Cells(lastrowsheetone + 1, 13) = Sheets("Sheet2").Cells(findmatch + 2, 7)






End Sub
#118

Default

Quote Originally Posted by oilcountry99 View Post
I have a question for you guys....first of all I apologize for putting this here but I figure you can help. I'm looking for a way to find tomorrows date in column A and delete that row and the 100 rows below it in excel 2010 vba.

The Date example I am looking for is this 3/28/2014 10:10:00 PM (the 10:10:00 PM is here is the game time which is irrelevant to my search) I want to find just the date portion

I would like to use something like today + 1 or something along those lines to specify the date.

I'm trying to pull odds and I get today's and tomorrow's when I do the web query and I'd like to delete tomorrows entrys.

Awesome thread here guys!!!
oil, I am on travel so I can't post full solution, but you need to do an "instr()" command. Look at prior posts for the code. It searches one string for another, ie your date cell will be searched for " " or space. So you know where the space is you know your date will be prior to the space. The instr command will return an integer of the position of the space. Ie 10 for your above sample. So you can then use the "mid(" command by doing mid(yourcell, 1, 9) where yourcell is the string or cell that had the date data, and the mid will return the string from chars 1 to 9 or in your example, 3/28/2014. Be sure to use a variable instead of 9 in the mid as it will not always be 9 characters when the month is two digits or the day is a single digit
#120

Default

Quote Originally Posted by a4u2fear View Post
oil, I am on travel so I can't post full solution, but you need to do an "instr()" command. Look at prior posts for the code. It searches one string for another, ie your date cell will be searched for " " or space. So you know where the space is you know your date will be prior to the space. The instr command will return an integer of the position of the space. Ie 10 for your above sample. So you can then use the "mid(" command by doing mid(yourcell, 1, 9) where yourcell is the string or cell that had the date data, and the mid will return the string from chars 1 to 9 or in your example, 3/28/2014. Be sure to use a variable instead of 9 in the mid as it will not always be 9 characters when the month is two digits or the day is a single digit
Should be something like

Celldate=mid(yourcell, 1, instr(...)-1)

instr is position of space so -1 will return data before it