Excel based tutorial for web scraping

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • oilcountry99
    SBR Wise Guy
    • 08-29-10
    • 707

    #246
    I've got another challenge if there are any takers....

    Conditional Formatting:
    I'm trying to get a formula that I can input into conditional formatting that will compare the value in cell E2 with the cell values in F2,G2,H2 and fill the cell which has the closest value to the value in E2.

    E2 F2 G2 H2
    9 7.2 5.6 14.7

    So in the example above cell F2 containing 7.2 would be filled with colour as it is the closest to the value of cell E2 which is 9

    Again seems simple in theory but I can't get it working properly.

    Thanks
    Comment
    • rufflesmuncher
      SBR Wise Guy
      • 04-02-11
      • 610

      #247
      Let me know if you figure this one out. Working on my excel skills, gotta land a good job in accounting this year
      Comment
      • EXhoosier10
        SBR MVP
        • 07-06-09
        • 3122

        #248
        =min(abs(f2-e2),abs(g2-e2),abs(h2,e2))
        Comment
        • oilcountry99
          SBR Wise Guy
          • 08-29-10
          • 707

          #249
          Originally posted by EXhoosier10
          =min(abs(f2-e2),abs(g2-e2),abs(h2,e2))
          Thanks for the input, however I get the error "You've entered too many arguments for this function."
          Comment
          • oilcountry99
            SBR Wise Guy
            • 08-29-10
            • 707

            #250
            Originally posted by oilcountry99
            Thanks for the input, however I get the error "You've entered too many arguments for this function."
            EXhoosier10
            I got rid of the error, but the conditional format highlights cells F2,G2,H2, instead of just F2 as this is the one closest to E2
            Comment
            • rufflesmuncher
              SBR Wise Guy
              • 04-02-11
              • 610

              #251
              Let me know if you figure this one out. Working on my excel skills, gotta land a good job in accounting this year
              Comment
              • rufflesmuncher
                SBR Wise Guy
                • 04-02-11
                • 610

                #252
                Dunno why that posted again. My bad
                Comment
                • oilcountry99
                  SBR Wise Guy
                  • 08-29-10
                  • 707

                  #253
                  Originally posted by EXhoosier10
                  =min(abs(f2-e2),abs(g2-e2),abs(h2,e2))
                  is it supposed to say (h2,e2) as shown above or (h2-e2) ? I assume (h2-e2) because the other gave the error "You've entered too many arguments for this function."
                  Comment
                  • oilcountry99
                    SBR Wise Guy
                    • 08-29-10
                    • 707

                    #254
                    This formula for conditional format is killing me. I've been search and trying everything....it seems like a simple, common request. To highlight the value closest to a given cell value....this is the common formula I find:

                    =INDEX(F2:H2,MATCH(MIN(ABS(F2:H2-E2)),ABS(F2:H2-E2),0))

                    but it highlights all cells in the range similar to what hoosiers solution did.

                    If I enter this formula in a random cell it will return the correct answer, so it does work, but it won't highlight the correct cell when put into conditional formatting, highlights all cells in range F2:H2

                    Damn excel, definite love/hate relationship here!
                    Comment
                    • thom321
                      SBR High Roller
                      • 06-17-11
                      • 112

                      #255
                      The workbook I attached has the easiest solution I could think of. However, it might be too inflexible for your needs but there are always many ways to accomplish the same thing in Excel.

                      Rather than using long and complicated formulas, I find that it is much easier to troubleshoot if the individual components are calculated separately e.g. use MATCH in cell A1 to find the row, then use e.g. INDEX(A2:F2, A1). That is just a general example and not specific to your problem but that is why I made the ABS and MIN calculations in separate cells rather than making it all into one longer, harder to edit, formula.



                      Originally posted by oilcountry99
                      This formula for conditional format is killing me. I've been search and trying everything....it seems like a simple, common request. To highlight the value closest to a given cell value....this is the common formula I find:

                      =INDEX(F2:H2,MATCH(MIN(ABS(F2:H2-E2)),ABS(F2:H2-E2),0))

                      but it highlights all cells in the range similar to what hoosiers solution did.

                      If I enter this formula in a random cell it will return the correct answer, so it does work, but it won't highlight the correct cell when put into conditional formatting, highlights all cells in range F2:H2

                      Damn excel, definite love/hate relationship here!
                      Attached Files
                      Comment
                      • oilcountry99
                        SBR Wise Guy
                        • 08-29-10
                        • 707

                        #256
                        I attached an example of my sheet. It shows only one game of the 100's i will have. I've made several predictions or estimates as you'll see in cells f2:h2 labeled est1, est2, est3. I want to highlight the number that is the closest match to the final score of 9 in cell E2. I want to do this for both teams final score and first five inning scores. I'm hoping this example will help further. What you have done in your example works but i would need many helper cells which is doable but creates a bit of a mess.
                        Attached Files
                        Comment
                        • thom321
                          SBR High Roller
                          • 06-17-11
                          • 112

                          #257
                          Well, it got a bit more complicated than I thought but I have attached my crack at it. The key is to enter the conditional formatting using the formula option like in the link below.



                          I copied the formatting down to row 356 and put a border around the section where the formatting applies.

                          As far as keeping a sheet clean, once I have done all my supporting calculations in columns that I don't want to see, I simply shift them off the screen or hide the columns.



                          Originally posted by oilcountry99
                          I attached an example of my sheet. It shows only one game of the 100's i will have. I've made several predictions or estimates as you'll see in cells f2:h2 labeled est1, est2, est3. I want to highlight the number that is the closest match to the final score of 9 in cell E2. I want to do this for both teams final score and first five inning scores. I'm hoping this example will help further. What you have done in your example works but i would need many helper cells which is doable but creates a bit of a mess.
                          Attached Files
                          Comment
                          • oilcountry99
                            SBR Wise Guy
                            • 08-29-10
                            • 707

                            #258
                            Originally posted by thom321
                            Well, it got a bit more complicated than I thought but I have attached my crack at it. The key is to enter the conditional formatting using the formula option like in the link below.



                            I copied the formatting down to row 356 and put a border around the section where the formatting applies.

                            As far as keeping a sheet clean, once I have done all my supporting calculations in columns that I don't want to see, I simply shift them off the screen or hide the columns.

                            This works, i didn't realize it would be so complicated to achieve something that seems so simple. Maybe there is an easier way but this will serve my purpose.

                            Thanks for your efforts! Impressive
                            Comment
                            • oilcountry99
                              SBR Wise Guy
                              • 08-29-10
                              • 707

                              #259
                              I have a column of conditionally formatted values. The conditional format shades the cell green when certain criteria are met. I want to count all of the green cells in the column to get a total # of green cells.

                              Anyone know how to do this?
                              Comment
                              • thom321
                                SBR High Roller
                                • 06-17-11
                                • 112

                                #260
                                Assuming the structure of the sheet is the same as the one I made, rather than counting each cell that is green, count the cells that meet the condition that makes the cell highlight green. I added it for "Final score" in the workbook I have attached.


                                Originally posted by oilcountry99
                                I have a column of conditionally formatted values. The conditional format shades the cell green when certain criteria are met. I want to count all of the green cells in the column to get a total # of green cells.

                                Anyone know how to do this?
                                Attached Files
                                Comment
                                • a4u2fear
                                  SBR Hall of Famer
                                  • 01-29-10
                                  • 8147

                                  #261
                                  Originally posted by oilcountry99
                                  I have a column of conditionally formatted values. The conditional format shades the cell green when certain criteria are met. I want to count all of the green cells in the column to get a total # of green cells.

                                  Anyone know how to do this?
                                  through code or not? Not using code - you can filter columns by color of cells, then just highlight all of those cells and in the bottom right of Excel it will tell you how many you highlighted.
                                  Comment
                                  • a4u2fear
                                    SBR Hall of Famer
                                    • 01-29-10
                                    • 8147

                                    #262
                                    anyone still coding? looks like both of your sheets do not have any VBA in them.
                                    Comment
                                    • thom321
                                      SBR High Roller
                                      • 06-17-11
                                      • 112

                                      #263
                                      To count highlighted cells, the following code works for me

                                      Once the code has been added to a module, each function can either be used as a regular Excel function in a spreadsheet cell or as a function in other VBA code.

                                      FIRST FUNCTION:

                                      Function CountHighlightedCells(rngCells As Range, lColorNumber As Long, Optional bAnyColor As Boolean = False) As Long
                                      'Counts the number of cells in the selected range that are highlighted with the color specified using
                                      'lColorNumber. Use GetColorNumber function to retrieve the color number for a specified cell.


                                      'If bAnyColor is set to True, this function will count any cell that is highlighted in any color.
                                      'If bAnyColor is set to True, the lColorNumber variable will be ignored regardless of what it is set to


                                      Dim rng As Range
                                      Dim lCount As Long
                                      Dim lCellColor As Long


                                      For Each rng In rngCells
                                      lCellColor = rng.Interior.Color
                                      If bAnyColor Then 'count cells that are highlighted with any color
                                      If rng.Interior.Pattern = xlNone Then
                                      lCount = lCount + 1
                                      End If
                                      Else 'if a specific color number should be used

                                      If lCellColor = lColorNumber Then
                                      lCount = lCount + 1
                                      End If
                                      End If
                                      Next


                                      CountHighlightedCells = lCount


                                      End Function

                                      SECOND FUNCTION:

                                      Function GetColorNumber(rng As Range) As Long
                                      'returns the color number for the specified range. Use it to get the correct color number to use
                                      'with CountHighlightedCells function


                                      GetColorIndex = rng.Interior.Color


                                      End Function

                                      Originally posted by a4u2fear
                                      anyone still coding? looks like both of your sheets do not have any VBA in them.
                                      Comment
                                      • oilcountry99
                                        SBR Wise Guy
                                        • 08-29-10
                                        • 707

                                        #264
                                        I will incorporate the vba when I get a chance. I found something similar online but was unsuccessful implementing it. I'll try your version.

                                        Comment
                                        • oilcountry99
                                          SBR Wise Guy
                                          • 08-29-10
                                          • 707

                                          #265
                                          I have some code that brings in web data, performs some calculations and then copies and pastes results to another sheet.

                                          Then the web data from the calculation sheet is deleted and the cells with the formulas remain. The problem I'm having is once the web data is deleted the formula results remain. So when the next loop runs and brings in the next web data set my numbers in the calculation/formula cells remain with the previous calculation results in them. So when I copy and paste to the other sheet I end up with a duplicate of what was copied and pasted the previous time.

                                          I hope that makes sense. All I want is when each data set comes in it performs the calculations and copy and pastes the data to another sheet.

                                          Any tips or advice as to what I may be doing wrong?

                                          TIA
                                          Comment
                                          • thom321
                                            SBR High Roller
                                            • 06-17-11
                                            • 112

                                            #266
                                            Sounds like you have calculations set to manual (xlCalculationManual) which it should be to make the code run faster. If that is the issue, insert the word Calculate after the web data has been imported but before you copy the data. So the code would be:

                                            code to bring in web data

                                            Calculate

                                            code to copy data to another sheet

                                            Originally posted by oilcountry99
                                            I have some code that brings in web data, performs some calculations and then copies and pastes results to another sheet.

                                            Then the web data from the calculation sheet is deleted and the cells with the formulas remain. The problem I'm having is once the web data is deleted the formula results remain. So when the next loop runs and brings in the next web data set my numbers in the calculation/formula cells remain with the previous calculation results in them. So when I copy and paste to the other sheet I end up with a duplicate of what was copied and pasted the previous time.

                                            I hope that makes sense. All I want is when each data set comes in it performs the calculations and copy and pastes the data to another sheet.

                                            Any tips or advice as to what I may be doing wrong?

                                            TIA
                                            Comment
                                            • oilcountry99
                                              SBR Wise Guy
                                              • 08-29-10
                                              • 707

                                              #267
                                              thom321

                                              thank you! that fixed it

                                              Comment
                                              • oilcountry99
                                                SBR Wise Guy
                                                • 08-29-10
                                                • 707

                                                #268
                                                I have 3 columns

                                                COL A has the team name
                                                COL K has "Home" or "Away"
                                                COL M has "W" or "L"

                                                What i would like to do is calculate Home Win streak, Home Lose streak, Road Win Streak and Road Lose streak as the current season progresses. I'm able to do Win and Lost streak but what I havent' been able to do is break the streaks up into Home and Road.

                                                Any suggestions here?

                                                TIA
                                                Comment
                                                • thom321
                                                  SBR High Roller
                                                  • 06-17-11
                                                  • 112

                                                  #269
                                                  Create four new columns, "Home Win", "Home Lose", "Road Win" and "Road Lose"

                                                  For column "Home Win" set the formula to (assuming you have data in row 3):
                                                  IF(AND(K3="Home",M3="W"),"W","L")

                                                  That gives you a Win/Loss column for Home Wins that you should be to calculate win/loss streaks on the same way you did for your original win/loss column.

                                                  Repeat for the remaining three columns, i.e. first create the win/loss data by using an IF/AND formula, then calculate win/loss streak.
                                                  Comment
                                                  • oilcountry99
                                                    SBR Wise Guy
                                                    • 08-29-10
                                                    • 707

                                                    #270
                                                    Originally posted by thom321
                                                    Create four new columns, "Home Win", "Home Lose", "Road Win" and "Road Lose"

                                                    For column "Home Win" set the formula to (assuming you have data in row 3):
                                                    IF(AND(K3="Home",M3="W"),"W","L")

                                                    That gives you a Win/Loss column for Home Wins that you should be to calculate win/loss streaks on the same way you did for your original win/loss column.

                                                    Repeat for the remaining three columns, i.e. first create the win/loss data by using an IF/AND formula, then calculate win/loss streak.
                                                    Thanks thom321, I was on the right track with what you stated except I was only calculating the wins under 'home wins' and not the losses as well, this should help solve the issue.

                                                    The other challenge I have is all the teams results are on 1 sheet, there for need to incorporate something to start a new streak when the team changes in COL A, maybe something like IF(A3=A2,formula if true,"0")

                                                    I'll give it a whirl, thanks again.
                                                    Comment
                                                    • oilcountry99
                                                      SBR Wise Guy
                                                      • 08-29-10
                                                      • 707

                                                      #271
                                                      I gave this a shot and the problem is, for example under the Home Wins column:

                                                      I end up getting an L if the game is a Home Loss or a Road game. How would I eliminate the "Road" results from the "Home Wins" column?

                                                      Under the "Home Wins" column I should have a blank or a "0" as a place holder for road results.
                                                      Comment
                                                      • oilcountry99
                                                        SBR Wise Guy
                                                        • 08-29-10
                                                        • 707

                                                        #272
                                                        Ideally vba formulas would probably be best.

                                                        Formula(HomeWinStreak)
                                                        Formula(HomeLossStreak)
                                                        Formula(RoadWinStreak)
                                                        Formula(RoadLossStreak)

                                                        Or even better 1 formula to do it all Formula(Streak)

                                                        Trick is how to write them . Maybe a4u2fear could chime in
                                                        Comment
                                                        • thom321
                                                          SBR High Roller
                                                          • 06-17-11
                                                          • 112

                                                          #273
                                                          I was a bit sloppy when writing the first response. The formula should have been:
                                                          IF(AND(K3="Home",M3="W"),"W","")

                                                          i.e. just change the "L" to "" or 0 or whatever you want it to be.

                                                          I don't have time to write the VBA for it but maybe someone else has.

                                                          Originally posted by oilcountry99
                                                          I gave this a shot and the problem is, for example under the Home Wins column:

                                                          I end up getting an L if the game is a Home Loss or a Road game. How would I eliminate the "Road" results from the "Home Wins" column?

                                                          Under the "Home Wins" column I should have a blank or a "0" as a place holder for road results.
                                                          Comment
                                                          • ZenChi
                                                            SBR Rookie
                                                            • 11-23-11
                                                            • 17

                                                            #274
                                                            Originally posted by akphidelt
                                                            Awesome thread guys. Hope I can help. Expert in excel. Also produced web programs, was on another site with an NBA tracker. Have it on www.akptracker.com. Can scrape any data off any site.

                                                            B_rad, I promise you if you took the time to do this through php and MySQL you would never use excel again. And it's completely free to do. Just download Xampp from apachefriends.org. Either way, hope I can help. Love this stuff.

                                                            Akphidelt would like to speak more with you in regards to the work you have done - I dont have private message capabilities yet so I can give email if you are interested. Thanks
                                                            Comment
                                                            • b_rad_1983
                                                              SBR High Roller
                                                              • 01-07-13
                                                              • 127

                                                              #275
                                                              Hey guys!

                                                              Looks like I missed quite a bit
                                                              Comment
                                                              • b_rad_1983
                                                                SBR High Roller
                                                                • 01-07-13
                                                                • 127

                                                                #276
                                                                Has anyone ever had their code do so much stuff it goes to not responding?
                                                                Over a full nhl season i need to pull in about a month at a time. it will not handle the full 6- 7 months.
                                                                Comment
                                                                • thom321
                                                                  SBR High Roller
                                                                  • 06-17-11
                                                                  • 112

                                                                  #277
                                                                  b_rad_1983,

                                                                  How are you importing it?

                                                                  If you are using VBA then the amount of data should not matter if coded properly. However, it can take some tweaking.

                                                                  If you are using formulas or queries, then it is pretty easy to overwhelm Excel so that it hangs and the way around that is to use VBA.
                                                                  Comment
                                                                  • b_rad_1983
                                                                    SBR High Roller
                                                                    • 01-07-13
                                                                    • 127

                                                                    #278
                                                                    Hi thom321,
                                                                    I ended up re- coding my vba and seems to be working better,
                                                                    i had too much stuff out of place.
                                                                    Comment
                                                                    • b_rad_1983
                                                                      SBR High Roller
                                                                      • 01-07-13
                                                                      • 127

                                                                      #279
                                                                      I would like to know if there is a way to only import the "Summary" table of the following link,



                                                                      I do not what the whole page for that day, only the summer
                                                                      Comment
                                                                      • b_rad_1983
                                                                        SBR High Roller
                                                                        • 01-07-13
                                                                        • 127

                                                                        #280
                                                                        here is my basic import layout


                                                                        For gameday = 63274 To 63849


                                                                        Sheets("Temp").Select
                                                                        With Sheets("Temp").QueryTables.Add(Connectio n:= _
                                                                        "URL;http://www.covers.com/pageLoader/pageLoader.aspx?page=/data/nhl/results/2014-2015/boxscore" & gameday & ".html", Destination:= _
                                                                        Range("$A$1"))
                                                                        .Name = test
                                                                        .FieldNames = True
                                                                        .RowNumbers = False
                                                                        .FillAdjacentFormulas = False
                                                                        .PreserveFormatting = True
                                                                        .RefreshOnFileOpen = False
                                                                        .BackgroundQuery = True
                                                                        .RefreshStyle = xlInsertDeleteCells
                                                                        .SavePassword = False
                                                                        .SaveData = True
                                                                        .AdjustColumnWidth = True
                                                                        .RefreshPeriod = 0
                                                                        .WebSelectionType = xlEntirePage
                                                                        .WebFormatting = xlWebFormattingrts
                                                                        .WebPreFormattedTextToColumns = True
                                                                        .WebConsecutiveDelimitersAsOne = True
                                                                        .WebSingleBlockTextImport = False
                                                                        .WebDisableDateRecognition = True
                                                                        .WebDisableRedirections = False
                                                                        .Refresh BackgroundQuery:=False
                                                                        End With
                                                                        Comment
                                                                        SBR Contests
                                                                        Collapse
                                                                        Top-Rated US Sportsbooks
                                                                        Collapse
                                                                        Working...