Database, scraping and correlation

  • statnerds
    SBR MVP
    • 09-23-09
    • 4047

    #1
    Database, scraping and correlation
    forgive the title. just had this question in my head.

    is there a correlation between 1st H and 2nd H scoring in NCAA Hoops?

    if no one had the answer, if i started a database and wrote some code to scrape thousands and thousands of box scores, would i have the tools to gather specific data needed to answer my question?

    i would want to look at just the points scored in the 1st H and then the 2nd H

    and i would also want to look at 1st H points in relation to the game total and compared to the 2nd H.

    if that was ambiguous or hard to follow let me know.

    thanks for any input.
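A minimal sketch of how the correlation question above could be checked once the half scores are in hand. It computes a plain Pearson coefficient in pure Python. The `games` list and the `pearson` helper are hypothetical, and the numbers are made-up placeholders, not scraped results.

```python
from math import sqrt

# Hypothetical sample: (1st-half combined points, 2nd-half combined points)
# per game. Made-up placeholder values, not real results.
games = [(58, 67), (61, 70), (72, 74), (49, 60), (66, 71), (75, 73)]

def pearson(pairs):
    """Pearson correlation coefficient between the two columns of `pairs`."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)

r = pearson(games)
print(f"r = {r:.3f}")
```

A value near 0 would mean 1st H scoring says little about 2nd H scoring, and a value near 1 would mean the two move together. With only a handful of games the figure is not meaningful, so a real check would need the thousands of box scores mentioned in the post.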
  • Wrecktangle
    SBR MVP
    • 03-01-09
    • 1524

    #2
    I'd be very surprised if there were not, as every other sport I've looked at so far has one.
    • statnerds
      SBR MVP
      • 09-23-09
      • 4047

      #3
      Thanks Wreck

      i looked through about a week's worth before just to get an idea and over 70% of the time, the 2nd H featured more points than the 1st. however, as the 1st H total points went to 70 and then 75 and up, that percentage came way down.

      can i get that specific data writing my own code?
      • Bsims
        SBR Wise Guy
        • 02-03-09
        • 827

        #4
        I looked at the last 3 seasons, with over 4000 games per season. The average scores are as follows:

        Season     1H     2H     OT
        2007-08    32.1   36.1   12.2
        2008-09    31.6   35.7   12.1
        2009-10    32.0   36.1   12.2
        • ljump12
          SBR High Roller
          • 12-08-09
          • 113

          #5
          Originally posted by statnerds
          Thanks Wreck

          i looked through about a week's worth before just to get an idea and over 70% of the time, the 2nd H featured more points than the 1st. however, as the 1st H total points went to 70 and then 75 and up, that percentage came way down.

          can i get that specific data writing my own code?
          You can get whatever data you want, as long as it exists somewhere. If you can access it in your web browser, you can write code to get it.
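A rough sketch of the parsing half of that idea, using only Python's standard library. The HTML snippet, the `linescore` class name, the team labels and the `LinescoreParser` class are all hypothetical. A real box-score page will be laid out differently and has to be inspected in the browser first.

```python
from html.parser import HTMLParser

# Hypothetical box-score markup with made-up scores. Columns assumed here:
# team name, 1st-half points, 2nd-half points.
SAMPLE_HTML = """
<table class="linescore">
  <tr><td>Away</td><td>30</td><td>38</td></tr>
  <tr><td>Home</td><td>35</td><td>41</td></tr>
</table>
"""

class LinescoreParser(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell and self.rows:
            self.rows[-1].append(data.strip())

parser = LinescoreParser()
parser.feed(SAMPLE_HTML)

# Combined points for both teams in each half.
first_half = sum(int(row[1]) for row in parser.rows)
second_half = sum(int(row[2]) for row in parser.rows)
print(first_half, second_half)
```

In a real scraper the string would come from something like `urllib.request.urlopen(url).read().decode()` instead of `SAMPLE_HTML`, and each game's pair of half totals would be written to a database row.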
          • trixtrix
            Restricted User
            • 04-13-06
            • 1897

            #6
            i would think points scored in 1st half would correlate to points being scored in 2nd half
            • Wrecktangle
              SBR MVP
              • 03-01-09
              • 1524

              #7
              Typically you need to include the line on the game, as better teams (usually indicated by being the betting favorite) tend to score more in the second half when behind in the 1st, and vice versa. This was the correlation I was thinking of in my earlier post.
              • Emily_Haines
                SBR Posting Legend
                • 04-14-09
                • 15917

                #8
                You do realize that if the 1st half is high scoring, the books adjust the 2nd half line based on the scoring in the first half rather than going by the total for the game?
                • statnerds
                  SBR MVP
                  • 09-23-09
                  • 4047

                  #9
                  i tracked games for a while, starting with the game Total minus the 1st H Total to get an idea of the 2nd H total. regardless of the points scored in the 1st H, the posted 2nd H Total was within 2.5 pts of the 2nd H line i came up with before the game started. so it seems the books trust their models and adjust very little based on the action of the 1st H.

                  i would want to get very specific:

                  1st H Total
                  Game Total
                  1st H actual score expressed as a % of the game total and a % compared to the 2nd H total

                  i would also want simple
                  1st H score
                  Avg 2nd H score produced for any specific 1st H total (say the avg 2nd H score of any 1st H that scored 61 pts.)

                  just in the small sample size of 222 games i checked, the 2nd H scored more, and by an avg of nearly 10 pts.

                  however, as that 1st H score went higher, the % ticked down slightly. i think long term 2nd H totals would provide more opportunities than the game totals.
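A minimal sketch of the tallies listed above, assuming each game has already been reduced to a pair of half totals. The `games` list holds made-up placeholder values and every variable name is hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (1st-half total, 2nd-half total), regulation only.
# Made-up placeholder values.
games = [(61, 70), (61, 66), (70, 72), (70, 68), (75, 73)]

# Group 2nd-half totals by the 1st-half total that preceded them.
second_by_first = defaultdict(list)
for h1, h2 in games:
    second_by_first[h1].append(h2)

# Average 2nd-half total for each observed 1st-half total.
avg_second = {h1: sum(v) / len(v) for h1, v in second_by_first.items()}

# 1st half as a share of the game total, and as a ratio to the 2nd half.
shares = [(h1 / (h1 + h2), h1 / h2) for h1, h2 in games]

# Fraction of games in which the 2nd half outscored the 1st.
second_higher = sum(h2 > h1 for h1, h2 in games) / len(games)

for h1 in sorted(avg_second):
    print(h1, avg_second[h1])
print(second_higher)
```

Looking up `avg_second[61]` answers the "avg 2nd H score of any 1st H that scored 61 pts" question directly. Exact 1st H totals will be thinly populated even with thousands of games, so grouping them into ranges may be more practical.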
                  • stephtop
                    SBR Rookie
                    • 04-07-10
                    • 9

                    #10
                    Any suggestions on betting mlb first half totals? seems more scoring is done late... anyone like to chime in?
                    • skrtelfan
                      SBR MVP
                      • 10-09-08
                      • 1913

                      #11
                      Not sure exactly what you're asking, but the 1h/2h split in NCAA hoops is around 46.5/53.5. In games with lower totals it's closer to 46/54, and in games with higher totals it's closer to 47/53.
                      • Dunder
                        Restricted User
                        • 10-26-09
                        • 3345

                        #12
                        There is a correlation, yes.
                        To be useful though, other factors need to be accounted for (score difference at end of 1H, closing spread).

                        Think about it. If 50 points were scored in two games in 1H:

                        Team A was favoured by 14 and leads 33-17 at the end of the first half
                        Team B was favoured by 2 and leads 26-24 at the end of the first half

                        Even if you ignore the total expectation before the game, would you expect the 2H scoring of these two games to be the same?
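One way to sketch that point in code is to stop pooling every game with the same 1H total and instead group games by closing spread and halftime margin before averaging the 2H total. Everything here is hypothetical: the `games` rows are made-up placeholders that imply nothing about real results, and the 10-point and 8-point cutoffs in `bucket` are arbitrary illustrative choices.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows, made-up placeholder values:
# (closing spread of the favorite, favorite's halftime margin, 1H total, 2H total)
games = [
    (14.0, 16, 50, 58),
    (13.5, 12, 52, 60),
    (2.0, 2, 50, 66),
    (1.5, -3, 49, 64),
]

def bucket(spread, margin):
    """Label a game by favorite size and by how close it is at halftime.

    The cutoffs (10 and 8 points) are arbitrary and only for illustration.
    """
    fav = "big favorite" if spread >= 10 else "small favorite"
    state = "blowout" if abs(margin) >= 8 else "close"
    return fav, state

# Collect 2H totals separately for each game situation.
second_half_by_bucket = defaultdict(list)
for spread, margin, h1, h2 in games:
    second_half_by_bucket[bucket(spread, margin)].append(h2)

avg_by_bucket = {k: mean(v) for k, v in second_half_by_bucket.items()}
for key, value in avg_by_bucket.items():
    print(key, value)
```

If the averages differ across groups in real data, then a single "average 2H for a 50-point 1H" figure is hiding the effect of spread and halftime margin.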
                        • idontlikerocks
                          SBR Wise Guy
                          • 10-09-07
                          • 571

                          #13
                          you may want to look at a few other factors here. one, is the game significant? is it a conference game? is it between ranked teams? two, what was the PACE of the first half? how many shots were taken and how many free throws? three, what is the score spread entering the second half? is it a close game? if you narrow your data to these categories you will find a system that pays.
                          • dwaechte
                            SBR Hall of Famer
                            • 08-27-07
                            • 5481

                            #14
                            Good discussion so far.

                            As statnerds brought up, 1H outcomes have very small effects on 2H outcomes. There is a correlation, but it's small.

                            idlrocks has brought up the main issue when it comes to 2H totals being adjusted from the pregame 2H total: pace. Offensive and defensive efficiency are very easy to predict for any matchup, but the pace is less predictable, so a lot can often be learned from the pace of the 1H. Obviously, you would also need to adjust for typical 2H pace differences based on score, in the same way you would adjust for 2H spread differences based on score, as others have outlined.
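A minimal sketch of measuring 1H pace from a box score, assuming the common box-score possession estimate FGA - ORB + TOV + w * FTA. The free-throw weight of 0.475 is an assumption (a value often used for college games), and the box-score line, the expected-possessions figure and all names below are hypothetical placeholders.

```python
def possessions(fga, orb, tov, fta, ft_weight=0.475):
    """Estimate possessions from box-score counts.

    ft_weight is an assumed approximation of how many free-throw
    attempts correspond to one possession.
    """
    return fga - orb + tov + ft_weight * fta

# Hypothetical 1st-half line for one team (made-up values).
first_half_poss = possessions(fga=28, orb=5, tov=7, fta=10)
first_half_points = 33

# Efficiency: points per possession in the 1st half.
points_per_poss = first_half_points / first_half_poss

# Hypothetical pregame expectation of possessions per half for this matchup.
expected_poss = 33.0

# Ratio above 1 means the half was played faster than expected.
pace_ratio = first_half_poss / expected_poss

print(round(first_half_poss, 2), round(points_per_poss, 3), round(pace_ratio, 3))
```

Splitting 1H scoring into pace and efficiency this way makes it possible to see whether a high-scoring half came from extra possessions or from hot shooting. How much weight to give the observed pace when adjusting a 2H total, and how to correct for score effects, is left open here.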