Help on running an Excel macro

  • mebaran
    SBR MVP
    • 09-16-09
    • 1540

    #1
    Help on running an Excel macro
    I have a website that I am planning on scraping for closing lines. I can already record a macro that gets a table from said website, but I need a macro that will do every single table that I need.

    Ex: I can web query http://nameofsite0001 and record a macro of me doing it, but how do I create a macro that gets the same tables from http://nameofsite0002 through http://nameofsite3000?

    I hope I am being clear enough. I can also provide the HTML of the webpage and the code from my initial macro. I would appreciate any help.
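
    For what it's worth, sequentially numbered URLs like those placeholders can be generated in a loop; here is a minimal sketch, assuming the nameofsiteNNNN pattern above is literal and that the recorded web-query steps would go inside the loop:

    Sub LoopNumberedPages()
        'visit every numbered page by building each URL in a loop;
        'Format(i, "0000") zero-pads the counter, so 1 becomes "0001"
        Dim i As Long
        Dim url As String
        For i = 1 To 3000
            url = "http://nameofsite" & Format(i, "0000")
            'run the recorded web-query steps against url here
        Next i
    End Sub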
  • Blax0r
    SBR Wise Guy
    • 10-13-10
    • 688

    #2
    Could you post your macro's VB code?
    • mebaran
      SBR MVP
      • 09-16-09
      • 1540

      #3
      Sub Macro1()
      '
      ' Macro1 Macro
      '

      '
          With ActiveSheet.QueryTables.Add(Connection:= _
              "URL;http://www.bigguy.com/main/welcome.html?screen=livepast&d3=2000-01-01", _
              Destination:=Range("$A$1"))
              .Name = "welcome.html?screen=livepast&d3=2000-01-01"
              .FieldNames = True
              .RowNumbers = False
              .FillAdjacentFormulas = False
              .PreserveFormatting = True
              .RefreshOnFileOpen = False
              .BackgroundQuery = True
              .RefreshStyle = xlInsertDeleteCells
              .SavePassword = False
              .SaveData = True
              .AdjustColumnWidth = True
              .RefreshPeriod = 0
              .WebSelectionType = xlEntirePage
              .WebFormatting = xlWebFormattingNone
              .WebPreFormattedTextToColumns = True
              .WebConsecutiveDelimitersAsOne = True
              .WebSingleBlockTextImport = False
              .WebDisableDateRecognition = False
              .WebDisableRedirections = False
              .Refresh BackgroundQuery:=False
          End With
      End Sub

      Basically, I would like to get closing lines from 2000-01-01 through today. Forgive me if my code looks funky.
      • Blax0r
        SBR Wise Guy
        • 10-13-10
        • 688

        #4
        Here you go; I tested it out w/ a small range and the macro seemed to work alright. (The tabbing got messed up in the copy/paste, just FYI.)

        Also, I wouldn't run the code below right away; you may get a lot of blank sheets if many of those dates fall outside a particular sport's season. It may be cleaner if you manually figure out the right start/end dates per season (which shouldn't be too bad).

        Sub Macro1()
        '
        ' Macro1 Macro
        '

        '
            'use these to specify your date range
            Dim startDate As Date
            Dim endDate As Date
            Dim datecounter As Date

            'serial 36526 represents 1/1/2000
            startDate = 36526

            'serial 40596 represents 2/22/2011
            endDate = 40596

            'loop over our range w/ a step of 1 (1 day at a time)
            For datecounter = startDate To endDate Step 1

                'just replacing the date text in the url w/ our datecounter variable (w/ the right formatting)
                With ActiveSheet.QueryTables.Add(Connection:= _
                    "URL;http://www.bigguy.com/main/welcome.html?screen=livepast&d3=" & Format(datecounter, "yyyy-MM-dd"), _
                    Destination:=Range("$A$1"))
                    .Name = "welcome.html?screen=livepast&d3=" & Format(datecounter, "yyyy-MM-dd")
                    .FieldNames = True
                    .RowNumbers = False
                    .FillAdjacentFormulas = False
                    .PreserveFormatting = True
                    .RefreshOnFileOpen = False
                    .BackgroundQuery = True
                    .RefreshStyle = xlInsertDeleteCells
                    .SavePassword = False
                    .SaveData = True
                    .AdjustColumnWidth = True
                    .RefreshPeriod = 0
                    .WebSelectionType = xlEntirePage
                    .WebFormatting = xlWebFormattingNone
                    .WebPreFormattedTextToColumns = True
                    .WebConsecutiveDelimitersAsOne = True
                    .WebSingleBlockTextImport = False
                    .WebDisableDateRecognition = False
                    .WebDisableRedirections = False
                    .Refresh BackgroundQuery:=False
                End With

                'create a new sheet and make it active so the next day's query lands there
                Sheets.Add After:=Sheets(Sheets.Count)
            Next
        End Sub
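
        One optional tweak, not in the original reply: the hard-coded serial numbers can be swapped for VBA's built-in DateSerial and Date functions, which makes the range self-documenting and runs it through today, as requested in post #3:

            'same start, without the magic serial number
            startDate = DateSerial(2000, 1, 1)
            'endDate becomes today's date via VBA's built-in Date function
            endDate = Date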
        Last edited by Blax0r; 02-22-11, 06:17 PM.
        • mebaran
          SBR MVP
          • 09-16-09
          • 1540

          #5
          Wow, yep, that works pretty well. I really appreciate it...until I can learn some more by myself, this will do fantastically for my purposes.

          Points incoming.
          • Blax0r
            SBR Wise Guy
            • 10-13-10
            • 688

            #6
            Thank you sir.

            Feel free to post in this thread or PM me if any problems arise.
            • SportsbetTracker
              SBR Rookie
              • 04-30-10
              • 26

              #7
              I would be careful with multiple looping queries that constantly ping the source data site. If you hit a site too fast and too often, they may flag your IP address and block your access to the site. In general, build in a looping delay of at least 10 seconds per grab; that should keep you under the radar. In addition, vary the access time: say, 8 seconds after one grab, then 15 seconds after the next, etc.

              Just to let you know, more and more sites are becoming aggressive in their anti-scraping technologies and legal tactics. For instance, a huge flag would be a hit on the website that does NOT show a standard browser like IE, Chrome, Firefox, Opera, or the like as the requestor.

              In some cases, the site techs flag EVERY request that comes in without a standard browser; such requests are typically less than 1% of the traffic, but in some cases account for 25% of the bandwidth usage. Of course, such usage does not expose the user to the advertising the source site depends on, so they are very vigilant about monitoring non-browser requests. When I must scrape data from many pages of the same site, I always use a client requestor that looks like a browser to stay a bit below the radar. Of course, that still doesn't cover up the actual quantity of hits, so your IP will still show at or near the top of site hits, which is then scrutinized for the patterns described above.
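
              By way of illustration, and not something posted in this thread: one way to make a VBA request identify itself as a standard browser is the late-bound MSXML2.ServerXMLHTTP object, which accepts a custom User-Agent header. The URL below is the placeholder from post #3, the User-Agent string is just an example, and parsing the response is left out:

              Sub FetchWithBrowserUA()
                  'fetch a page while presenting a browser-style requestor ID
                  Dim http As Object
                  Set http = CreateObject("MSXML2.ServerXMLHTTP.6.0")
                  http.Open "GET", _
                      "http://www.bigguy.com/main/welcome.html?screen=livepast&d3=2000-01-01", _
                      False
                  http.setRequestHeader "User-Agent", _
                      "Mozilla/5.0 (Windows NT 6.1; rv:2.0) Gecko/20100101 Firefox/4.0"
                  http.send
                  'http.responseText now holds the raw HTML to parse
              End Sub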
              • mebaran
                SBR MVP
                • 09-16-09
                • 1540

                #8
                Originally posted by SportsbetTracker
                I would be careful with multiple looping queries that constantly ping the source data site. [...]
                I'm aware of this, but like I was telling Blax0r, I'm still new at programming, so there are only a few options for me at the moment. I appreciate the input. Would you by any chance know how to put a delay into the macro? Or is there a website that would show me how? Thanks
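
                A minimal sketch of one way to do that, since no reply in the thread spells it out: VBA's built-in Application.Wait pauses the running macro, and Rnd can vary the gap between 8 and 15 seconds as SportsbetTracker suggested. Calling this just before the Next line in post #4's loop would space out the grabs:

                Sub RandomDelay()
                    'pause a random 8-15 seconds so the grabs don't hammer the site
                    Dim delaySeconds As Long
                    Randomize
                    delaySeconds = 8 + Int(Rnd * 8)  'whole number from 8 to 15
                    Application.Wait Now + TimeSerial(0, 0, delaySeconds)
                End Sub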
                • TCMBob
                  SBR Rookie
                  • 01-16-11
                  • 43

                  #9
                  I'd be interested in the website info also, if it exists.
                  • WendysRox
                    SBR High Roller
                    • 07-22-10
                    • 184

                    #10
                    Love this nerdy type of thread. I actually use a macro in Excel to scrape some data.