Does low R2 negate good results?

  • Waterstpub87
    SBR MVP
    • 09-09-09
    • 4102

    #1
    Does low R2 negate good results?
    I have been working on a model to predict NFL totals. Currently it is 27/40, hitting 67.5%. Today I regressed the predicted total against the actual total and came up with an R2 of .1783. Note that this wasn't over vs. under; it was just against the raw point totals. There have been many wins where I predicted 30 and it ended up 10, or 60 and it ended up 80. Should I be worried about long-term viability because of the low R2? Both the P-value and Significance F were very low.
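    For reference, the regression itself is nothing exotic; a minimal sketch of what I'm doing is below (the arrays are placeholders, not my actual data):

    import numpy as np
    from scipy.stats import linregress

    # Placeholder numbers only; substitute your own predicted and actual game totals.
    predicted = np.array([44.5, 38.0, 51.0, 47.5, 41.0, 52.5])
    actual    = np.array([30.0, 45.0, 60.0, 41.0, 37.0, 55.0])

    # Simple linear regression of the actual totals on the predicted totals.
    fit = linregress(predicted, actual)
    print("R2      :", fit.rvalue ** 2)   # share of variance in the actual totals explained
    print("p-value :", fit.pvalue)        # significance of the slope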
  • mathdotcom
    SBR Posting Legend
    • 03-24-08
    • 11689

    #2
    This is very common. Game results are very noisy.
    Comment
    • mathdotcom
      SBR Posting Legend
      • 03-24-08
      • 11689

      #3
      Take a market like basketball. If you collect even a 2000-3000 game database and then graph total points scored in a game vs. the market total, you're going to see a big oval blob with a positive slope. Imagine a regression line going through it. Now imagine how large the sum of squared errors is. That's why the R-squared is low.
      Comment
      • Justin7
        SBR Hall of Famer
        • 07-31-06
        • 8577

        #4
        Compare your R2 vs the closing lines' R2. If you get pretty close to theirs, you have something good. If your R2 is smaller over a huge sample, you will destroy the sport.
        Comment
        • Waterstpub87
          SBR MVP
          • 09-09-09
          • 4102

          #5
          Originally posted by Justin7
          Compare your R2 vs the closing lines' R2. If you get pretty close to theirs, you have something good. If your R2 is smaller over a huge sample, you will destroy the sport.
          Thank you for the help. I am a little confused on this. I was under the impression that I want my R2 to be a bigger number, as in the variance in the total is better explained by my model. Would I want it to be smaller than the line vs. total? I did run a regression on the line vs. the total, ending up with an R2 of less than .001. I am going to re-run them in case I made a mistake, but that was the impression I was under.
          Comment
          • Waterstpub87
            SBR MVP
            • 09-09-09
            • 4102

            #6
            I recopied the data and ran a new regression. This time my model came up with an R2 of .1783 and the line-to-total regression came up with an R2 of .2192.
            Comment
            • MonkeyF0cker
              SBR Posting Legend
              • 06-12-07
              • 12144

              #7
              Originally posted by Waterstpub87
              Thank you for the help. I am a little confused on this. I was under the impression that I want my R2 to be a bigger number, as in the variance in the total is better explained by my model. Would I want it to be smaller than the line vs. total? I did run a regression on the line vs. the total, ending up with an R2 of less than .001. I am going to re-run them in case I made a mistake, but that was the impression I was under.
              Yes. You want a higher R2. Not lower.
              Comment
              • mathdotcom
                SBR Posting Legend
                • 03-24-08
                • 11689

                #8
                Originally posted by Justin7
                Compare your R2 vs the closing lines' R2. If you get pretty close to theirs, you have something good. If your R2 is smaller over a huge sample, you will destroy the sport.
                Comment
                • Waterstpub87
                  SBR MVP
                  • 09-09-09
                  • 4102

                  #9
                  Justin7, how close is pretty close? My model was .4 or so off of theirs. Do you think I might have something, albeit with a low sample size?
                  Comment
                  • Justin7
                    SBR Hall of Famer
                    • 07-31-06
                    • 8577

                    #10
                    Let's make sure we are talking about the same thing, first.

                    One measure of how good a model is: look at every game. Delta = spread result minus spread prediction (or total result minus total prediction). If, for all games, you take the sum of delta^2, determine the average, and take the square root, that is what I was calling R2 error. I'm pretty sure everyone else is using R2 to describe something else.

                    If your error (as I described) is, on average, equal to or less than that of the market spreads, you will win. If you get very close (i.e. 11 vs 10.5), you will probably have subsets of your model that do very well.
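                    If it helps, a rough sketch of that calculation (placeholder numbers; plug in your own predictions, the market totals, and the final scores):

                    import numpy as np

                    # Placeholder totals; swap in your own predictions, market numbers, and results.
                    model_pred   = np.array([44.5, 38.0, 51.0, 47.5, 41.0])
                    market_total = np.array([43.0, 40.5, 49.0, 46.0, 42.5])
                    result       = np.array([30.0, 45.0, 60.0, 41.0, 37.0])

                    def error(prediction, outcome):
                        # Sum of squared deltas, averaged, then square-rooted.
                        delta = outcome - prediction
                        return np.sqrt(np.mean(delta ** 2))

                    print("model error :", error(model_pred, result))
                    print("market error:", error(market_total, result))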
                    Comment
                    • Waterstpub87
                      SBR MVP
                      • 09-09-09
                      • 4102

                      #11
                      Originally posted by Justin7
                      Let's make sure we are talking about the same thing, first. One measure of how good a model is: look at every game. Delta = spread result minus spread prediction (or total result minus total prediction). If, for all games, you take the sum of delta^2, determine the average, and take the square root, that is what I was calling R2 error. I'm pretty sure everyone else is using R2 to describe something else. If your error (as I described) is, on average, equal to or less than that of the market spreads, you will win. If you get very close (i.e. 11 vs 10.5), you will probably have subsets of your model that do very well.
                      Justin7,
                      I took the absolute value of the difference between predicted vs. actual, and line vs. actual. I squared all the results, averaged them, then took the square root. I ended up with 12.28 for the model and 11.88 for the line, so I'm pretty close. It works out toward what you said in the second part, because the 67.5% figure comes from the bets I made when my prediction was 5 points past the line either way. So am I possibly on to something here?
                      Comment
                      • Justin7
                        SBR Hall of Famer
                        • 07-31-06
                        • 8577

                        #12
                        Originally posted by Waterstpub87
                        Justin7,
                        I took the absolute value of the difference between predicted vs. actual, and line vs. actual. I squared all the results, averaged them, then took the square root. I ended up with 12.28 for the model and 11.88 for the line, so I'm pretty close. It works out toward what you said in the second part, because the 67.5% figure comes from the bets I made when my prediction was 5 points past the line either way. So am I possibly on to something here?
                        That is close enough that there might be some good subsets.

                        The next thing I would do: backtest your projections for the last 10 years vs. both openers and closers. Try 3 different cutoffs: 3, 5, and 7 points. What results do you get vs. openers? And closers?
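                        A rough sketch of that backtest, assuming one row per game with columns for your projection, the opener, the closer, and the final total (the file and column names below are placeholders):

                        import pandas as pd

                        # Hypothetical file and columns: model_total, opener, closer, final_total.
                        games = pd.read_csv("nfl_totals_history.csv")

                        def grade(row, line):
                            # 1 = win, 0 = loss, None = push.
                            if row["final_total"] == line:
                                return None
                            picked_over = row["model_total"] > line
                            landed_over = row["final_total"] > line
                            return 1 if picked_over == landed_over else 0

                        for line_col in ["opener", "closer"]:
                            for cutoff in [3, 5, 7]:
                                bets = games[(games["model_total"] - games[line_col]).abs() >= cutoff]
                                results = bets.apply(lambda r: grade(r, r[line_col]), axis=1).dropna()
                                if len(results):
                                    print(line_col, cutoff, f"{results.astype(float).mean():.1%} over {len(results)} bets")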
                        Comment
                        • mathdotcom
                          SBR Posting Legend
                          • 03-24-08
                          • 11689

                          #13
                          You guys are talking about mean squared error, not R-squared...
                          Comment
                          • AlwaysDrawing
                            SBR Wise Guy
                            • 11-20-09
                            • 657

                            #14
                            Originally posted by Waterstpub87
                            Justin7,
                            I took the absolute value of the difference between predicted vs. actual, and line vs. actual. I squared all the results, averaged them, then took the square root. I ended up with 12.28 for the model and 11.88 for the line, so I'm pretty close. It works out toward what you said in the second part, because the 67.5% figure comes from the bets I made when my prediction was 5 points past the line either way. So am I possibly on to something here?
                            If you're squaring it, you don't need to take the absolute value.

                            Also, mdc is right, j7 is talking about MSE (though MSE is inversely related to R^2).

                            So, you want a lower MSE than closers, but a higher R^2.
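                            To see the connection concretely: for a simple regression of the actual totals on the predictions, R^2 equals 1 minus (the mean squared error around the fitted line divided by the variance of the actual totals), so a smaller error means a bigger R^2. A toy check with arbitrary numbers:

                            import numpy as np
                            from scipy.stats import linregress

                            # Arbitrary placeholder data.
                            predicted = np.array([44.5, 38.0, 51.0, 47.5, 41.0, 52.5])
                            actual    = np.array([30.0, 45.0, 60.0, 41.0, 37.0, 55.0])

                            fit = linregress(predicted, actual)
                            fitted = fit.intercept + fit.slope * predicted

                            mse_around_fit = np.mean((actual - fitted) ** 2)         # error left after the fit
                            var_actual     = np.mean((actual - actual.mean()) ** 2)  # spread of the actual totals

                            print("R2 from the regression:", fit.rvalue ** 2)
                            print("1 - MSE / variance    :", 1 - mse_around_fit / var_actual)  # same value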
                            Comment
                            • AlwaysDrawing
                              SBR Wise Guy
                              • 11-20-09
                              • 657

                              #15
                              That said, you should be using root mean squared error, because that is intuitively what people think of (the absolute difference between actual and predicted), not the square of the difference.


                              EDIT: OP may have been doing that. j7 is being unclear.
                              Last edited by AlwaysDrawing; 12-07-11, 02:47 PM. Reason: OP may have been using root MSE
                              Comment
                              • AlwaysDrawing
                                SBR Wise Guy
                                • 11-20-09
                                • 657

                                #16
                                Originally posted by Waterstpub87
                                Justin7,
                                I took the absolute value of the difference between predicted vs. actual, and line vs. actual. I squared all the results, averaged them, then took the square root. I ended up with 12.28 for the model and 11.88 for the line, so I'm pretty close. It works out toward what you said in the second part, because the 67.5% figure comes from the bets I made when my prediction was 5 points past the line either way. So am I possibly on to something here?
                                OP:

                                Remember, the width of the confidence interval is proportional to the RMSE, but with a difference so small (just a couple of percent), I would be wary of adding additional parameters to your model to try to reduce the RMSE further, unless you have something you feel strongly might influence your model from an intuitive standpoint.

                                Backtest it in tranches where your line differs, like j7 says, and see whether your model or the market was better. Remember not to backtest it on data you used to create the model.
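                                One simple way to keep that honest is a season split: fit on the early years and grade only the later ones. A minimal sketch (file and column names are placeholders again):

                                import pandas as pd

                                # Placeholder file with a season column alongside the totals data.
                                games = pd.read_csv("nfl_totals_history.csv")

                                fit_years  = games[games["season"] <= 2008]   # build and tune the model on these years only
                                test_years = games[games["season"] >= 2009]   # grade it out of sample on these

                                print(len(fit_years), "games for fitting,", len(test_years), "games for testing")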
                                Comment
                                • Waterstpub87
                                  SBR MVP
                                  • 09-09-09
                                  • 4102

                                  #17
                                  Originally posted by AlwaysDrawing
                                  If you're squaring it, you don't need to take the absolute value. Also, mdc is right, j7 is talking about MSE (though MSE is inversely related to R^2). So, you want a lower MSE than closers, but a higher R^2.
                                  AD,
                                  You're right on that; I just wanted the column there. I reran the regression with only the games past 5 points, which is what I've been betting. I ended up with an R2 of .1994 for the model and an R2 of .1496 for the line. This ties into the fact that it has been doing great past 5 points. I'm going to backtest it over the 10 years when I get a chance, but is this promising thus far?
                                  Comment
                                  • AlwaysDrawing
                                    SBR Wise Guy
                                    • 11-20-09
                                    • 657

                                    #18
                                    Originally posted by Waterstpub87
                                    AD,
                                    You're right on that; I just wanted the column there. I reran the regression with only the games past 5 points, which is what I've been betting. I ended up with an R2 of .1994 for the model and an R2 of .1496 for the line. This ties into the fact that it has been doing great past 5 points. I'm going to backtest it over the 10 years when I get a chance, but is this promising thus far?
                                    I would say so. Sounds like a solid model. Backtesting it is of course one way to make sure, though forward testing (i.e., betting it yourself) works too. Higher risk, but you'll definitely figure out if it's profitable.

                                    If you decide you want someone to take a closer look, let me know.
                                    Comment