1. #1
    nash13
    Join Date: 01-21-14
    Posts: 1,122
    Betpoints: 7160

    Calculating Pace in different sports

    Hi everyone,
    I'd like to discuss how to calculate pace of the game for several sports.
    I found a basic formula for possession in basketball: (0.96*((field goals attempted)+(turnovers)+0.44*(free throws attempted)-(offensive rebounds))). Are there any other metrics? For football or hockey esp.
    Thanks.

  2. #2
    destruction88
    Join Date: 04-06-19
    Posts: 12
    Betpoints: 26

    TLDR; You may already be aware but considering how simple that formula is, I would recommend you go on YouTube and watch PART of a game of netball. No need to watch a whole match unless you like Aussie girls in short skirts.

    SPECIFICALLY, watch what happens after a goal is scored. Now multiply that by x100 goals per game. This same phenomenon occurs in basketball and a myriad of other sports. Soccer compensates for this; basketball does so only very slightly in specific quarters, but mostly not. And the multiply by x100 will be pretty close in basketball as well.

    This is a very valuable component of my model which I do not want to give away to all of the public. So hopefully you can read between the lines.
    Last edited by destruction88; 04-08-19 at 06:45 PM. Reason: SBR likes to strip out line-breaks.

  3. #3
    destruction88
    Join Date: 04-06-19
    Posts: 12
    Betpoints: 26

    In models with a time component:

    Slow-paced VS slow-paced is not as super-low scoring as a model will suggest, and
    Fast-paced VS fast-paced is not as super-high scoring as a model will suggest.

    If a model does not factor in time, then that is an improvement that needs to be made.

    Even tennis despite its unique scoring structure is affected by time. Consider 5th set fatigue out in the hot sun in Australian Open for instance. Time is universal in all sports even if there is no clock on screen.

    Imagine a soccer league where EVERY time a goal was scored, the team celebrates for 30 minutes. The clock would keep running during this time AND no minutes would be added at the end of each half.

    No matter how fast-paced each team is, even if every player is the Flash... the maximum number of goals per match is THREE if the time-keeper is perfectly accurate and players always celebrate for exactly 30 minutes.

    In a normal regression based model, if players move at the speed of light (ignoring physics), the total heads towards infinity. This is fundamentally wrong and will affect predictions in a non-hypothetical league even when teams only have x1.05 or x1.10 pace-multipliers against the league average.

    I would personally go so far as to say that a basic fundamental model that understands how the game/pace pieces together will always do better than a basic regression (like the basic formula in OP).
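A rough sketch of the celebration league, to make the cap concrete. Everything here is an assumption for illustration: a 90-minute match, a league-average 30 minutes of play needed per goal, and goals treated as a continuous quantity. The "naive" function stands in for any model that simply scales the total by pace; it is not any particular regression.

```python
# Hypothetical toy model of the 30-minute-celebration league described above.
MATCH_MINUTES = 90.0
CELEBRATION_MINUTES = 30.0  # clock keeps running, no time added back


def time_capped_goals(base_minutes_per_goal, pace_multiplier):
    """Expected goals when every goal also burns a fixed celebration."""
    minutes_to_score = base_minutes_per_goal / pace_multiplier
    return MATCH_MINUTES / (minutes_to_score + CELEBRATION_MINUTES)


def naive_goals(base_minutes_per_goal, pace_multiplier):
    """League-average total simply multiplied by pace (no time cap)."""
    league_average = time_capped_goals(base_minutes_per_goal, 1.0)
    return league_average * pace_multiplier


if __name__ == "__main__":
    for pace in (1.0, 1.05, 1.10, 2.0, 1000.0):
        capped = time_capped_goals(30.0, pace)
        naive = naive_goals(30.0, pace)
        print(f"pace x{pace}: capped={capped:.3f} naive={naive:.3f}")
```

Under these assumptions the capped total never reaches 3 however large the pace multiplier gets, while the naive total grows without limit, and the two already disagree at x1.05 and x1.10.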

  4. #4
    destruction88
    Join Date: 04-06-19
    Posts: 12
    Betpoints: 26

    Based on intuition, I do not think a FGA should be weighted with a 1:1 ratio as a turnover. Maybe I am wrong.

    Eg: Team-A scores. The average time it takes Team-B to pass the ball in, dribble it up court and have it stolen should, in my mind, be LESS than the average time Team-B takes to do the same but end with a shot attempt (FGA) instead of losing the ball, no?


    I would try to work out how much time each stat takes, eg turnover = 11 secs, foul (resulting in FTAs) = 13 seconds, 2pt FGA = 15.5 seconds, 3pt FGA = 14.5 seconds, **** time after successful FGA = 2.5 secs... those specific numbers are made up, obviously.

    Then try to predict how frequently each event will happen (two teams that are useless at steals are going to have fewer steals than a typical game) and SIMULATE that. Play-by-play data is available so that should make it easy to get the actual times and frequencies of events occurring one after the other. Approximations can be surprisingly sufficient. Normal frequencies of events should be able to be calculated from the box score.

    The beauty is you don't need to be constantly tracking play-by-play data, there is no need to go rain-man / syndicate level crazy. The aim is just find out what the average time for a possession is dependent on the end outcome (turnover, foul, shot etc), those times will hardly change season-to-season so you can run a simulation.


    Another consideration, even if you are hell-bent on doing a regression, is fast-break points. Also, after a steal it should take much less time for the thief to FGA / be fouled / lose-the-ball-back, because the possession starts further up the court instead of in the back-court.

    That formula is just too simple and does not understand the mechanics of a game of basketball which is very important for Totals.

    Imagine going back to the hypothetical soccer league, what if one team was a little unique from the others and you found that only 50% of the time they celebrated for 30 minutes and the other 50% they did not celebrate (0 minutes)?

    What if 50% of the time when they do celebrate it is for an AVERAGE of 30 minutes, some times they celebrate for 31 minutes, other times for 29 minutes? Hence the need for a simulator, even a simple one with a lot of assumptions / shortcomings.
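A minimal sketch of the kind of simple simulator described in this post. The per-outcome seconds are the made-up numbers from the post; the outcome frequencies, make rates, the +/-25% duration jitter and the 48-minute game are all hypothetical placeholders, and offensive rebounds, overtime and end-of-quarter effects are ignored.

```python
import random

# Made-up seconds per possession outcome, copied from the post above.
SECONDS = {"turnover": 11.0, "foul": 13.0, "fga2": 15.5, "fga3": 14.5}
EXTRA_AFTER_MAKE = 2.5  # the post's extra time after a successful FGA

# Hypothetical frequencies and make rates (NOT from the post).
FREQ = {"turnover": 0.14, "foul": 0.12, "fga2": 0.46, "fga3": 0.28}
MAKE_RATE = {"fga2": 0.52, "fga3": 0.36}
GAME_SECONDS = 48 * 60


def simulate_game(rng, jitter=0.25):
    """Count possessions (both teams combined) that fit in regulation."""
    outcomes, weights = list(FREQ), list(FREQ.values())
    clock, possessions = 0.0, 0
    while True:
        outcome = rng.choices(outcomes, weights=weights)[0]
        # Durations vary around their average instead of being fixed.
        duration = SECONDS[outcome] * rng.uniform(1 - jitter, 1 + jitter)
        if outcome in MAKE_RATE and rng.random() < MAKE_RATE[outcome]:
            duration += EXTRA_AFTER_MAKE
        if clock + duration > GAME_SECONDS:
            return possessions
        clock += duration
        possessions += 1


def simulate_many(n_games=2000, seed=1):
    """Return (mean, min, max) combined possessions over many games."""
    rng = random.Random(seed)
    counts = [simulate_game(rng) for _ in range(n_games)]
    return sum(counts) / n_games, min(counts), max(counts)


if __name__ == "__main__":
    print(simulate_many())
```

Swapping in matchup-specific frequencies (for example, fewer turnovers when neither team steals well) changes the whole distribution of possessions, not just the average, which is the point of simulating rather than regressing.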

  5. #5
    nash13
    Join Date: 01-21-14
    Posts: 1,122
    Betpoints: 7160

    That formula was mainly counting on possession.
    The more detailed formula looks like this:
    Pace Formula=[240/(Team Minutes)]*(Team Possessions+Opponent Possessions)/2

    Basic Possession Formula=0.96*[(Field Goal Attempts)+(Turnovers)+0.44*(Free Throw Attempts)-(Offensive Rebounds)]

    More Specific Possession Formula=0.5 * ((Field Goal Attempts + 0.4 * Free Throw Attempts - 1.07 * (Offensive Rebounds / (Offensive Rebounds + Opponent Defensive Rebounds)) * (Field Goal Attempts - FG) + Turnovers) + (Opponent Field Goal Attempts + 0.4 * Opponent Free Throw Attempts - 1.07 * (Opponent Offensive Rebounds / (Opponent Offensive Rebounds + Defensive Rebounds)) * (Opponent Field Goal Attempts - Opponent FG) + Opponent Turnovers))

    Defensive Efficiency Formula=100*(Points Allowed/Possessions). My aim was starting a conversation around this introduction.
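For anyone who wants to plug numbers in, here is a direct transcription of the four formulas above into Python. The function names, the dict keys and the sample box score are hypothetical; only the arithmetic comes from the post.

```python
def basic_possessions(fga, tov, fta, orb):
    """Basic possession formula."""
    return 0.96 * (fga + tov + 0.44 * fta - orb)


def one_side_estimate(fga, fg, fta, orb, opp_drb, tov):
    """One team's term inside the more specific possession formula."""
    orb_share = orb / (orb + opp_drb)
    return fga + 0.4 * fta - 1.07 * orb_share * (fga - fg) + tov


def specific_possessions(team, opp):
    """More specific formula: average of both teams' estimates."""
    team_term = one_side_estimate(team["fga"], team["fg"], team["fta"],
                                  team["orb"], opp["drb"], team["tov"])
    opp_term = one_side_estimate(opp["fga"], opp["fg"], opp["fta"],
                                 opp["orb"], team["drb"], opp["tov"])
    return 0.5 * (team_term + opp_term)


def pace(team_minutes, team_poss, opp_poss):
    """Pace formula; team_minutes is 240 for a regulation NBA game."""
    return (240 / team_minutes) * (team_poss + opp_poss) / 2


def defensive_efficiency(points_allowed, possessions):
    """Points allowed per 100 possessions."""
    return 100 * points_allowed / possessions


if __name__ == "__main__":
    # Hypothetical box score, for illustration only.
    team = {"fga": 85, "fg": 40, "fta": 25, "orb": 10, "drb": 33, "tov": 14}
    opp = {"fga": 88, "fg": 42, "fta": 20, "orb": 9, "drb": 35, "tov": 13}
    poss = specific_possessions(team, opp)
    print(basic_possessions(85, 14, 25, 10), poss, pace(240, poss, poss))
```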

  6. #6
    Waterstpub87
    Slan go foill
    Join Date: 09-09-09
    Posts: 4,043
    Betpoints: 7236

    For football, as long as you mean American, and not world:

    You can get seconds per play from Football Outsiders: https://www.footballoutsiders.com/stats/pacestats

    For college football:

    I calculate this by using several statistics from Teamrankings.com:

    (3600 * Average time of possession)/ (Rush attempts per game + completions per game)

    This would give you seconds per offensive play per team.
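A small sketch of that calculation. It assumes "average time of possession" is entered as the share of a 60-minute game the offence holds the ball (so 30:00 becomes 0.5); the helper for converting a mm:ss string and all the sample numbers are hypothetical. The denominator uses rush attempts plus completions exactly as posted.

```python
def possession_share(mm_ss, game_minutes=60):
    """Convert a 'mm:ss' time of possession into a share of the game."""
    minutes, seconds = mm_ss.split(":")
    return (int(minutes) * 60 + int(seconds)) / (game_minutes * 60)


def seconds_per_play(top_share, rush_att_per_game, completions_per_game):
    """Seconds per offensive play, per the formula in the post."""
    return 3600 * top_share / (rush_att_per_game + completions_per_game)


if __name__ == "__main__":
    share = possession_share("30:00")
    print(round(seconds_per_play(share, 38.0, 20.0), 2))
```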

    With basketball, I use that formula for college. For NBA, I just pull the advanced team statistics from NBA.com, and use their pace numbers.

  7. #7
    destruction88
    Join Date: 04-06-19
    Posts: 12
    Betpoints: 26

    Do you run a simulation or {just use some back-fitted distribution depending on Line and Total} to work out your edge / correct prices?

  8. #8
    destruction88
    Join Date: 04-06-19
    Posts: 12
    Betpoints: 26

    Trying to be as succinct as possible. How to improve pace / totals prediction. Careful what you wish for.

    EASY: Is the match the Grand Final or part of the true Finals Series or a playoff? Which play-off stage? Using the World Cup as an example: R16, Q-F, S-F, Third-Place, Final. With adequate sample size, like the NBA playoffs, some very juicy angles can be incorporated based on stage. The multipliers I have taken advantage of can be ****ing huge: x0.71, x0.86, x0.92, x1.10, x1.22... (those are not from basketball, but I see no reason for an exception in BB; even x0.96 times 200pts = 8pt edge, yum).

    EASY: Fast-paced vs fast-paced often produces incredibly inflated Totals in many models, definitely worth testing the extremes and the same goes for slow-paced vs slow-paced, or low-scoring vs low-scoring (semantics). Are the actual results as extreme as the Totals component of the model predicts or not extreme enough?

    EASY: Outlier management. These cause newbies probably the most pain and mislead the mugs the most. Exclude, truncate, median instead of mean, median mixed with mean, binary (over/under) % of league average, logarithms / exponential dilution, confidence interval/level are all possible solutions.

    EASY: I hate the term but "mean reversion" in between seasons, though not due to some mystical force as most statisticians make out the reason to be.

    EASY: A team's goal is actually to WIN, not to play defensively or offensively or to have a great defensive efficiency; a win is a win. Yes, the coach might yell at them and force them to do a bunch of defensive drills the next week if they play sloppy in their last match, but pace / Totals are not a team's priority so long as they are not fixing matches. Thus noise affects predicting Totals more than it affects the spread. This noise needs to be removed. A confidence level can be used, eg a x1.02 pace multiplier is likely just noise (x1.00) and a x1.20 pace multiplier may really be x1.18.

    EASY: Ladder position, are they close on the table or far apart? USUALLY... Close = defensive, far-apart = offensive. What is at stake, are they at the top of the table vying for #1 or at the bottom scared of relegation in non-American sports (big stakes = even more defensive usually)? Is one of them or both of them out of playoff contention and no longer caring about D?

    EASY: Stage of season, first round, mid season, last game of regular season, % of season completed. Careful to not confuse lower-scoring / high-scoring during the mid season as something to do with the round IF weather affects the sport. It may not be the round that is significant, it may just be that winter or summer coincides exactly with the mid season.

    EASY: League shifts as @Nash13 is well aware. Is the league getting higher-scoring or are teams becoming more defensive and league average decreasing? Checking for rule changes might be a good way of identifying that but I would suggest using a more automated approach that requires less judgement calls, and less time doing extensive research / reading articles. Careful of Simpson's paradox if the number of games is different each season, very relevant for Internationals and constantly evolving leagues.

    EASY: Rest time: One team coming off a bye. Both teams coming off a prolonged break like International windows, Christmas etc.

    EASY: Time of day, which may just be a byproduct of weather but is worth looking into. Natural light vs artificial light definitely affects human behaviour / performance. Not to mention fatigue, which is greater at 7:00pm than at 12:00pm. Players may not eat much on game day, so low blood sugar towards the end of the day could be another reason for a difference based on time of day.

    EASY: **** time after scoring. Already mentioned.
    Points Awarded:

    Shev2 gave destruction88 1 Betpoint(s) for this post.
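One possible way to express the noise-removal idea from the "a win is a win" item above, as a sketch only. The post does not say how the confidence level is applied; a fixed 0.02 noise band subtracted from the deviation (soft-thresholding) is an assumption chosen because it reproduces the two examples given (x1.02 to x1.00 and x1.20 to x1.18).

```python
def shrink_pace_multiplier(raw, noise_band=0.02):
    """Pull a pace multiplier towards x1.00 by a fixed noise band.

    noise_band is a hypothetical tuning value, not taken from the post.
    """
    deviation = raw - 1.0
    kept = max(abs(deviation) - noise_band, 0.0)
    return 1.0 + (kept if deviation > 0 else -kept)


if __name__ == "__main__":
    for raw in (1.02, 1.20, 0.90, 0.99):
        print(raw, round(shrink_pace_multiplier(raw), 4))
```

A shrinkage that scales with sample size would be another reasonable choice; the band width would need to be tested against actual results either way.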


  9. #9
    destruction88
    Join Date: 04-06-19
    Posts: 12
    Betpoints: 26

    OBVIOUSLY: Anything that affects the Spread may also affect the Pace / Total, even if an idea does not affect the Spread or you cannot confidently prove it, still worth testing the idea on Pace / Totals.

    TIME-INTENSIVE: Circumstantial eg lost-last-3-matches vs lost-last-3-matches, same home city / state (derbies). The possibilities here are infinite. This is one thing machine learning can do well, but the results still require a lot of filtering to avoid over-fitting and garbage.

    TIME-INTENSIVE: Weather of course.
    - Indoor stadiums generally not affected by weather though.
    - Even if it does not rain during the match, a saturated pitch is largely what slows down outdoor sports (last 12 hours of rainfall more important than amount of rain during match unless it is raining cats and dogs during the match).
    - Humidity also has a massive effect on performance, and the combination with temperature is important too.
    - Temperature effect may not be linear, it can be a parabola, eg cold = low-scoring, warm-hot = normal-scoring, extremely-hot = low-scoring.
    - When testing it is important to not cheat, if on average you bet 4 hours in advance of the match starting, then your model should not know what the weather was during the actual match and it should not know what the weather was the hour before either. Using weather data is actually very hard due to this unless you always bet on Totals when there is only 5-30 minutes before the start of a match or only bet live which may have lower limits.

    TIME-INTENSIVE: To generate the player model, but this should already be factored in if the modeler is using a player-based model. If not, this is an argument for creating one. Star players injured may result in blow-outs (high-scoring, missing goal-keeper) or a low-scoring match due to an inability to convert opportunities (eg low FG% or awful replacement striker in soccer).

    TIME-INTENSIVE: Past matchups are important and not just noise. If the Total goes under YOUR MODEL in the last 10 matches between the two teams, you need to find a way of incorporating that WHILE ensuring you are not just reacting to just a random sequence. Are the coaches for both teams the same for all of those 10 or X past matches AND the same for this upcoming match? How many points did the Total actually go under by not just the binary sequence of 10x Under (careful with outliers)? What about how many of the 40! quarters (larger sample) went Under? Are the past 10 games part of the regular season and NOT special? Charity / exhibition / pre-season matches produce a lot of outliers. Then you need to work out what percentage of the time coaches will randomly have a "mood" swing and not play defensively in the upcoming match, this will dampen the past-match-multiplier but improve accuracy. At the same time you might want to consider the spread implication based on past-matchups. IT IS NOT NOISE ALL THE TIME. Some teams have genuine hard-ons vs specific teams, some teams are just really mismatched in play style rendering a consistent bias.

    CONVOLUTED: Referees definitely can impact the pace / Total. Some like to really interfere, and some are blind / allow a free-flowing game. BUT it is important to understand though the overall meta implications of professional sports, they are ENTERTAINMENT, the goal for a TV broadcaster is not to produce an equitable outcome based on the skill sets of each team. Thus most referees may not be malicious, have conscious or even unconscious biases... they may instead actually just be puppets RANDOMLY chosen (depending on the match) and thus forced or manipulated into creating certain outcomes such as keeping games close, favouring star players, favouring the home team, favouring the large population team, creating high scoring riots for the sake of ENTERTAINMENT. Thus there is spurious risk of misattributing the true cause to referees, maybe 99.0% of referees are actually honest and have spines. Haha.

    CONVOLUTED: A bookmaker openly commented on this years ago in an article: one team may control pace more than another. Eg Alpha team with a pace of x1.20 plays Beta team with a pace of x0.90; the correct pace prediction for the game may not be 1.20 x 0.90 = 1.08, it may turn out to be x1.14 instead. Analyse the standard deviation of pace for each team and see if that predicts their alpha-ness. Alpha teams definitely DO exist*, whether it is possible to detect with the stats available is a different story. *Some teams kill the clock for instance, it does not matter how fast the other team plays, that killed time is not going to change as a result of the other team's pace if possession remains 50/50.

    KINGPIN: Already mentioned. But pace / Totals should always consider time per possession and more specifically the time per possession depending on the outcome / situation of that possession (eg shot, miss, foul, turnover, first down, fourth down, team is losing, game is in garbage time) which can only properly be accounted for in how it affects the distribution and precise final prediction by doing a simulation.

    KINGPIN: Simulation using multiple variable inputs will make you the kingpin in Totals. With solid assumptions, it will beat the crap out of your competition's amateur regression models with overly simple distribution tables.
    Points Awarded:

    Shev2 gave destruction88 1 Betpoint(s) for this post.
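A sketch of the Alpha/Beta pace example above. The weighted log-blend is purely an assumption for illustration (the bookmaker's actual method is not given): with equal weights it reduces to plain multiplication (1.20 x 0.90 = 1.08), and giving the Alpha team more weight moves the prediction towards its own pace.

```python
import math


def blended_pace(alpha_pace, beta_pace, alpha_weight=0.5):
    """Weighted log-blend; alpha_weight=0.5 is plain multiplication."""
    log_blend = (alpha_weight * math.log(alpha_pace)
                 + (1 - alpha_weight) * math.log(beta_pace))
    return math.exp(2 * log_blend)


def implied_alpha_weight(alpha_pace, beta_pace, observed_pace):
    """Weight that would reproduce an observed pace under this blend."""
    log_a, log_b = math.log(alpha_pace), math.log(beta_pace)
    return (math.log(observed_pace) / 2 - log_b) / (log_a - log_b)


if __name__ == "__main__":
    print(blended_pace(1.20, 0.90))                # equal control
    print(implied_alpha_weight(1.20, 0.90, 1.14))  # weight implied by x1.14
```

Under this assumed form, the x1.14 outcome corresponds to the Alpha team holding somewhat more than half of the control. Whether such a weight can be estimated from available stats is the open question the post raises.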


  10. #10
    nash13
    Join Date: 01-21-14
    Posts: 1,122
    Betpoints: 7160

    This is an awesome read. Thank you for the great insight and systematic analysis you put into this.
    I can tell you that rule changes and shifts in play structure feed heavily into the betting process.
    The numbers show that the transformation from traditional basketball to the modern era broke the old metrics for evaluating the game.
    While the correlation between scoring and totals increased drastically, the totals as numbers themselves stayed on point.
    But the irony is: if you take 4 numeric values into consideration (pace, avg. offensive stats, avg. defensive stats and rest), you will get results close to 55% in the predictive model. And that's on 5000+ games in the last 5 seasons.

  11. #11
    vampire assassin
    Join Date: 03-09-18
    Posts: 279
    Betpoints: 9896

    For baskets, why not just read play-by-play, and count that way instead of approximating?

  12. #12
    destruction88
    Join Date: 04-06-19
    Posts: 12
    Betpoints: 26

    Quote Originally Posted by nash13 View Post
    But the irony is: if you take 4 numeric values into consideration (pace, avg. offensive stats, avg. defensive stats and rest), you will get results close to 55% in the predictive model. And that's on 5000+ games in the last 5 seasons.
    There is no cap at 55%, certainly not on Totals. Pinnacle won't ban you if you start winning 55.1% of the time on Totals. Even if I naively thought 55% was the long-term hard-cap on spreads then I would at least think 60% was the asymptote on Totals and that would be my goal.

    Years ago I steamed NBA Totals live, the slow book knew what I was doing (sort of, they knew I was up to mischief). All my bets had to be approved when live, which meant that if I had to wait 30 seconds for approval, the oyster trader would gain a material advantage over any punters waiting in the queue. Naturally, 98% of the bets would be on the under. So the trader, when deciding whether to accept or reject the bet, would wait to see if the next shot went in. If the shot did go in, that would **** my under bet, but it would be approved anyway in most cases; if it missed, the trader would very often reject the bet.

    It did not matter, I still made enough to live off for 6 months (combined with steaming tennis) as my only source of income before I started modeling. Unless a very large number of NBA games are (or back then were) being fixed, there are some crazy smart syndicates out there that likely have strike-rates of somewhere in the region of 58-62% albeit against live Totals at an unshaded sharp book. As in they were taking $1.70 odds and still managing to win which is what protected me from malicious trader(s) at the soft book.
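To put the $1.70 figure in context, a short worked example. It assumes "$1.70" means decimal odds and flat one-unit stakes; the function names are hypothetical.

```python
def break_even_win_rate(decimal_odds):
    """Win rate at which a flat-stake bettor makes exactly zero."""
    return 1.0 / decimal_odds


def profit_per_unit(win_rate, decimal_odds):
    """Expected profit per unit staked at a given strike-rate."""
    return win_rate * (decimal_odds - 1.0) - (1.0 - win_rate)


if __name__ == "__main__":
    print(round(break_even_win_rate(1.70), 4))
    for rate in (0.55, 0.58, 0.60, 0.62):
        print(rate, round(profit_per_unit(rate, 1.70), 4))
```

Break-even at 1.70 is 1/1.70, a little under 59%, so a strike-rate in the 58-62% region is roughly what it takes just to hold up at those prices.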

    They may have had inside information but considering how low limits are live mid quarter, it would have made more sense to bet 5-30 minutes before the start of matches or halftime at vastly higher limits rather than wait until midway through a quarter to exploit inside information. They were not snipers either, not going to win at 58-62% by court-siding and I would not have been able to successfully steam them.

    Building a simulator is fun. Javascript with HTML output is a very pleasant experience and useful coding experience wherever the future might lead. A simulator is also very good for live betting if available.

  13. #13
    nash13
    Join Date: 01-21-14
    Posts: 1,122
    Betpoints: 7160

    A(( 0.96 * ( ( field goals attempted ) + ( turnovers ) + 0.44 * ( free throws attempted ) - ( offensive rebounds ) ) )@1) query
    90.65 season = 1995
    89.12 season = 1996
    89.34 season = 1997
    87.94 season = 1998
    91.84 season = 1999
    90.19 season = 2000
    89.59 season = 2001
    89.98 season = 2002
    88.78 season = 2003
    89.79 season = 2004
    89.36 season = 2005
    90.62 season = 2006
    90.73 season = 2007
    90.25 season = 2008
    91.07 season = 2009
    90.59 season = 2010
    89.94 season = 2011
    90.67 season = 2012
    92.45 season = 2013
    92.63 season = 2014
    94.13 season = 2015
    94.77 season = 2016
    95.47 season = 2017
    98.45 season = 2018
    If you dig deeper into the data, you can see a pattern evolving since 2014 for the NBA. I guess the new rule changes and game philosophy directly contradict what bettors used to do back in the day. I have found a metric imbalance in how lines have been made since that shift, and it has been yielding 12%-17% on totals over 2500+ bets.
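A quick way to see the post-2014 shift in the numbers listed above. The values are copied from the query output; splitting at 2014 follows the post, and the helper name is hypothetical.

```python
# Season averages of the basic possession formula, from the query above.
POSSESSIONS_BY_SEASON = {
    1995: 90.65, 1996: 89.12, 1997: 89.34, 1998: 87.94, 1999: 91.84,
    2000: 90.19, 2001: 89.59, 2002: 89.98, 2003: 88.78, 2004: 89.79,
    2005: 89.36, 2006: 90.62, 2007: 90.73, 2008: 90.25, 2009: 91.07,
    2010: 90.59, 2011: 89.94, 2012: 90.67, 2013: 92.45, 2014: 92.63,
    2015: 94.13, 2016: 94.77, 2017: 95.47, 2018: 98.45,
}


def mean_over(first_season, last_season):
    """Average of the listed values for seasons in [first, last]."""
    values = [v for s, v in POSSESSIONS_BY_SEASON.items()
              if first_season <= s <= last_season]
    return sum(values) / len(values)


if __name__ == "__main__":
    print(round(mean_over(1995, 2013), 2), round(mean_over(2014, 2018), 2))
    # Year-over-year changes make the recent drift easy to spot.
    seasons = sorted(POSSESSIONS_BY_SEASON)
    for prev, cur in zip(seasons, seasons[1:]):
        diff = POSSESSIONS_BY_SEASON[cur] - POSSESSIONS_BY_SEASON[prev]
        print(cur, round(diff, 2))
```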

  14. #14
    danshan11
    I am good at coin flips, I really am!
    Join Date: 07-08-17
    Posts: 4,101
    Betpoints: 2888

    I had successfully modeled NBA totals last year and this year literally had to stop halfway through the season when my totals were so far off!

  15. #15
    nash13
    Join Date: 01-21-14
    Posts: 1,122
    Betpoints: 7160

    Quote Originally Posted by danshan11 View Post
    I had successfully modeled NBA totals last year and this year literally had to stop halfway through the season when my totals were so far off!
    Based on what? I had a model which worked from 2004 to 2014 and then completely turned around. And it's doing wonders right now.

  16. #16
    ChuckyTheGoat
    Join Date: 04-04-11
    Posts: 31,500
    Betpoints: 24857

    Good thread. Good Luck to u guys. Very interesting.

    Consider Pace in NFL. Pace would translate to Possessions per Game. The interesting angle would be Possessions that are effectively kneel-downs at end of half. Would imagine that u have to adjust to discard those.
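A minimal sketch of that adjustment. The Drive record and its fields are hypothetical (real drive data would need to be mapped onto something like it), and the rule used here, dropping drives that end a half and consist only of kneel-downs, is one assumed definition of "effectively kneel-downs".

```python
from dataclasses import dataclass


@dataclass
class Drive:
    team: str
    plays: int
    kneel_downs: int   # kneel-down plays within the drive
    ended_half: bool   # drive ended because the half expired


def is_kneel_down_drive(drive):
    """True when a half-ending drive was nothing but kneel-downs."""
    return (drive.ended_half and drive.kneel_downs > 0
            and drive.kneel_downs == drive.plays)


def meaningful_possessions(drives):
    """Possessions per game after discarding kneel-down drives."""
    return [d for d in drives if not is_kneel_down_drive(d)]


if __name__ == "__main__":
    game = [
        Drive("HOME", plays=8, kneel_downs=0, ended_half=False),
        Drive("AWAY", plays=5, kneel_downs=0, ended_half=False),
        Drive("HOME", plays=1, kneel_downs=1, ended_half=True),
        Drive("AWAY", plays=6, kneel_downs=0, ended_half=True),
    ]
    print(len(game), len(meaningful_possessions(game)))
```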
