Excel based tutorial for web scraping

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • a4u2fear
    SBR Hall of Famer
    • 01-29-10
    • 8147

    #281
    B-rad, I don't know how to only import some of it, but I would rather just grab what you need from the import, or delete what's not needed after it's imported
    Comment
    • b_rad_1983
      SBR High Roller
      • 01-07-13
      • 127

      #282
      That's currently what I'm working on.
      Thanks for the reply!

      All is trying to do is pull in the odds for each game this year.
      Comment
      • Bsims
        SBR Wise Guy
        • 02-03-09
        • 827

        #283
        I've been gone for a couple of years (old man with medical problems) but decided to have another look. Stumbled onto this thread. It's great, thanks.
        Comment
        • b_rad_1983
          SBR High Roller
          • 01-07-13
          • 127

          #284
          A4, I believe that Webselection type is where you can change what tables to bring in. Just need to add a line below it that indicates which tables from a page.

          No matter what I do, covers takes forever it seems to import.

          Bsims, do you have a project on the go?
          Comment
          • Bsims
            SBR Wise Guy
            • 02-03-09
            • 827

            #285
            Originally posted by b_rad_1983
            Bsims, do you have a project on the go?
            After several years on the sidelines, I've decided to enter the analytical fray again. The first step is to review past efforts and decide which ones show enough promise to take another look at. This is non trivial, because I have over a decade and thousands of hours of work to go through.
            Comment
            • a4u2fear
              SBR Hall of Famer
              • 01-29-10
              • 8147

              #286
              Originally posted by b_rad_1983
              A4, I believe that Webselection type is where you can change what tables to bring in. Just need to add a line below it that indicates which tables from a page.

              No matter what I do, covers takes forever it seems to import.

              Bsims, do you have a project on the go?
              are you looking for new games or old years worth?
              Comment
              • a4u2fear
                SBR Hall of Famer
                • 01-29-10
                • 8147

                #287
                I use sBR for importing today's games, it's the best and I have zero issues with import time, I have never tried covers bc it never had all the information I wanted
                Comment
                • b_rad_1983
                  SBR High Roller
                  • 01-07-13
                  • 127

                  #288
                  I wanted a final score with the line from the previous days and years.
                  Comment
                  • b_rad_1983
                    SBR High Roller
                    • 01-07-13
                    • 127

                    #289
                    Importing with your standard template as shown at the start of thread Sbr for some odd reason doesn't import as is.
                    It fills all the info into column A



                    With Sheets("Temp").QueryTables.Add(Connectio n:= _
                    "URL;http://www.sportsbookreview.com/betting-odds/nhl-hockey/?date=20150101", Destination:= _
                    Range("$A$1"))
                    .Name = ""
                    .FieldNames = True
                    .RowNumbers = False
                    .FillAdjacentFormulas = False
                    .PreserveFormatting = True
                    .RefreshOnFileOpen = False
                    .BackgroundQuery = True
                    .RefreshStyle = xlInsertDeleteCells
                    .SavePassword = False
                    .SaveData = True
                    .AdjustColumnWidth = True
                    .RefreshPeriod = 0
                    .WebSelectionType = xlEntirePage
                    .WebFormatting = xlWebFormattingNone
                    .WebPreFormattedTextToColumns = True
                    .WebConsecutiveDelimitersAsOne = True
                    .WebSingleBlockTextImport = False
                    .WebDisableDateRecognition = False
                    .WebDisableRedirections = False
                    .Refresh BackgroundQuery:=False
                    End With
                    Comment
                    • a4u2fear
                      SBR Hall of Famer
                      • 01-29-10
                      • 8147

                      #290
                      Brad not exactly sure of your issue, but I think you need to use the data tab, then text to columns, and you can select the delimiter, whether it's a space, comma etc. remember, you can always turn on record macro and it will give you the code to do it automatically
                      Comment
                      • b_rad_1983
                        SBR High Roller
                        • 01-07-13
                        • 127

                        #291
                        I tried it briefly, but wouldn't cut the mustard. I tried a different site, with more work than SBR, I got it to pull in exactly how I wanted it.
                        Comment
                        • oilcountry99
                          SBR Wise Guy
                          • 08-29-10
                          • 707

                          #292
                          a4u2fear

                          I have a question regarding your original code file. What if we want to use it for the current season and we update the file everyday, or at random intervals.

                          Basically what I'd like the code to do is check for the last date or gameid or some type of "marker" we have in our data set and continue on from that point. It doesn't make sense to have to update the entire season each time we want to update the data set.

                          I hope that makes sense, any help you can provide would be appreciated!
                          Thanks
                          Comment
                          • b_rad_1983
                            SBR High Roller
                            • 01-07-13
                            • 127

                            #293
                            Looking for some help on a fairly easy code.

                            1) Range("A" & i - 3).Range("$A$1:$s$1").Copy _
                            2) Destination:=Worksheets("Covers").Range( "a1")

                            This code repeats everytime a cell has a certain value. What I would like to do, is
                            1) is the first row to copy
                            2)is where I would like to copy the data to, so its a diff sheet and the data will need to be posted on the last cell + 1
                            Comment
                            • oilcountry99
                              SBR Wise Guy
                              • 08-29-10
                              • 707

                              #294
                              I'm going to ask this again....

                              How can we get the boxscores to update for the current season starting at the last recorded game in our database? I know the NHL season hasn't started yet and I hope to get this running before that happens. I would like the macro to check the database for the last recorded boxscore and continue importing from that point. I don't want the macro to have run from game 1 of the season every day. The example provided here by the OP is great to pull in past seasons, not so efficient for the current season.

                              I was trying to insert the game id beside each game and then I could use that as a reference point to continue on but I'm not having any success. I'm no expert here but I can get my way around fairly well in terms of figuring out existing code, it's creating new stuff that I struggle with. I'm sure this concept has been used many times.

                              Please toss me a bone here....struggling excel user
                              Comment
                              • a4u2fear
                                SBR Hall of Famer
                                • 01-29-10
                                • 8147

                                #295
                                oil, good luck this season!

                                now, i'm not sure this is exactly what you need, but if i have an excel file full of data, and i want to add it on after the last row taken up:

                                lastRow = Sheets("Sheet1").UsedRange.Rows.Count

                                then add after

                                keep this in mind - if you at one point had 10 rows of data, and deleted 9 rows of values (by clicking on the data and deleting but not actually deleted the rows by clicking on the row number on the left side), it will still show lastRow=10. If this happens, just highlight the rows by clicking the numbers on the left side of excel then delete.

                                this what you needed?
                                Comment
                                • oilcountry99
                                  SBR Wise Guy
                                  • 08-29-10
                                  • 707

                                  #296
                                  a4

                                  I am using your code for pulling boxscores from nhl dot com and what I want to do is be able to use it for the current season. For example each morning I would like to import the previous days boxscores and add them to the bottom of the list of boxscores that I've already imported. Or if I miss a few days and I update my boxscores it will only add the new ones from the last recorded day to current day. It will save the time of importing every game every day. What I think i need is to insert a 'marker' in col A of each imported row, such as the game number. So when I goto import results the macro can search my data base for the last game # and continue from that point.

                                  I'm trying to figure out a way to code this into the macro, I hope what I'm saying is understandable.
                                  Comment
                                  • oilcountry99
                                    SBR Wise Guy
                                    • 08-29-10
                                    • 707

                                    #297
                                    Originally posted by a4u2fear
                                    Now, let's add the If/for loop above with the other for loop where the webpage sub will be called for each game

                                    Dim totalgames as integer 'i.e. 1230
                                    Dim NHLyear as integer 'i.e. 12, two digit because the year in NHL's website is only two digits
                                    Dim gamenumber as integer

                                    For NHLyear=11 to 13

                                    If NHLyear=12 then
                                    totalgames=720 'shortened season!
                                    else
                                    totalgames=1230 '2011 and 2013 both were full seasons
                                    end if

                                    For gamenumber=1 to totalgames
                                    'execute this code, eventually will be importing a webpage and performing code
                                    Call ImportWebpage(gamenumber) ' this sub does not exist yet
                                    Next

                                    Next
                                    As a continuation of my last post. Is there a way to make the line highlighted in blue above dynamic. For example can the '1' above be changed to find the last recorded game number in sheet 1? I understand we would have to add a game number marker to each game, which I was able to do. I just can't seem to get the code to pick up where it left off the next time I come back to update. There is maybe another way as well this was just a thought.

                                    Again, my goal is to use this during the current season, I hate to have to do this but I'm having a hard time finding a source for NHL TEAM HOME and AWAY SPLIT STATS, leading me to create my own data
                                    Last edited by oilcountry99; 09-10-15, 03:21 PM.
                                    Comment
                                    • jonal
                                      SBR Wise Guy
                                      • 06-01-09
                                      • 772

                                      #298
                                      Originally posted by a4u2fear
                                      Application.ScreenUpdating = False

                                      By not enabling screen updating, the code will run faster. You will see importing many pages takes a while.

                                      In the webpage sub, you will see I created a string called gamestring. I needed to do this because if I had added these 0s to the game variable which is an integer, adding 00 to 11 would still make game = 11 and it would not work. By using a string variable, which is text, turns the 11 variable to 0011. NHL.coms game numbers have four digits (because 1230 games) and now our string is the perfect length.

                                      If you are feeling good about your code, go ahead and take a look at the page you imported and what you variables you want to move into your main sheet and start filling in code in PullData sub.

                                      I'm too hungover from last night to do any more. Hopefully this gets some of you started

                                      Hi,

                                      I'm making a effort to learn web scraping via vba and am a bit of a dummy so please excuse my ignorance. I tried to run the code you put in post #22 and get "Compile Error" Sub or Function not defined with "Call PullBack" highlighted in VBA. I have a feeling this is something simple that I'm not following/not understanding but I'm been learning R for the past few months and the little VBA I do recall is not registering in my head.

                                      I guess what I'm asking is what is vba saying the error is and what does "Call PullBack" do?

                                      Thanks for taking the time for this.
                                      Comment
                                      • jonal
                                        SBR Wise Guy
                                        • 06-01-09
                                        • 772

                                        #299
                                        Also am I better off learning how to webscrape via Python? I have a pretty good grasp of Python and using numpy and pandas but am in the process of an interview for a data analyst role that requires good Excel skills so thought it would be wise to know how to do webscraping in Excel/VBA. I'm new to this stuff but am interested in it.
                                        Comment
                                        • bjb7223
                                          SBR Posting Legend
                                          • 11-03-12
                                          • 10349

                                          #300
                                          a4u2fear are you still around?
                                          Comment
                                          • b_rad_1983
                                            SBR High Roller
                                            • 01-07-13
                                            • 127

                                            #301
                                            Originally posted by bjb7223
                                            a4u2fear are you still around?
                                            We all are.. ask your question.. We may be able to help
                                            Comment
                                            • bjb7223
                                              SBR Posting Legend
                                              • 11-03-12
                                              • 10349

                                              #302
                                              Originally posted by b_rad_1983
                                              We all are.. ask your question.. We may be able to help
                                              I need to know how the scrap SBR NBA Odds in Excel using VBA. I want to create several columns

                                              Date, home Team, Road Team, Closing Total, 2nd Half Total, 1st Quarter Scoring, 2nd Quarter Scoring, 3rd Quarter Scoring, 4th quarter Scoring. From the website below.

                                              Comment
                                              • a4u2fear
                                                SBR Hall of Famer
                                                • 01-29-10
                                                • 8147

                                                #303
                                                Originally posted by bjb7223
                                                I need to know how the scrap SBR NBA Odds in Excel using VBA. I want to create several columns

                                                Date, home Team, Road Team, Closing Total, 2nd Half Total, 1st Quarter Scoring, 2nd Quarter Scoring, 3rd Quarter Scoring, 4th quarter Scoring. From the website below.

                                                https://www.sportsbookreview.com/bet...?date=20171021
                                                have you tried reading the first page of posts? I believe i gave you entirely what you need, all you have to do is switch the NHL website page with the page above.
                                                Comment
                                                • BetterBizness
                                                  SBR Hall of Famer
                                                  • 05-20-06
                                                  • 5737

                                                  #304
                                                  For those that currently scrape, is this still working?
                                                  Comment
                                                  SBR Contests
                                                  Collapse
                                                  Top-Rated US Sportsbooks
                                                  Collapse
                                                  Working...