Does low R2 negate good results?

  • Waterstpub87
    SBR MVP
    • 09-09-09
    • 4102

    #1
    Does low R2 negate good results?
    I have been working on a model to predict NFL totals. Currently it is 27/40, hitting 67.5%. Today I regressed the predicted total against the actual total and came up with an R2 of .1783. Note that this wasn't over vs. under; it was just against the raw point totals. There have been many wins where I predicted 30 and it ended up 10, or 60 and it ended up 80. Should I be worried about long-term viability because of the low R2? Both the P-value and Significance F were very low.
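    For reference, the regression itself is nothing exotic; a minimal sketch of what I'm doing is below (the arrays are placeholders, not my actual data):

    import numpy as np
    from scipy.stats import linregress

    # Placeholder numbers only; substitute your own predicted and actual game totals.
    predicted = np.array([44.5, 38.0, 51.0, 47.5, 41.0, 52.5])
    actual    = np.array([30.0, 45.0, 60.0, 41.0, 37.0, 55.0])

    # Simple linear regression of the actual totals on the predicted totals.
    fit = linregress(predicted, actual)
    print("R2      :", fit.rvalue ** 2)   # share of variance in the actual totals explained
    print("p-value :", fit.pvalue)        # significance of the slope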
  • mathdotcom
    SBR Posting Legend
    • 03-24-08
    • 11689

    #2
    This is very common. Game results are very noisy.
    Comment
    • mathdotcom
      SBR Posting Legend
      • 03-24-08
      • 11689

      #3
      Take a market like basketball. If you collect even a 2000-3000 game database and then graph total points scored in a game vs. the market total, you're going to see a big oval blob with a positive slope. Imagine a regression line going through it. Now imagine how large the sum of squared errors is. That's why the R-squared is low.
      Comment
      • Justin7
        SBR Hall of Famer
        • 07-31-06
        • 8577

        #4
        Compare your R2 vs the closing lines' R2. If you get pretty close to theirs, you have something good. If your R2 is smaller over a huge sample, you will destroy the sport.
        Comment
        • Waterstpub87
          SBR MVP
          • 09-09-09
          • 4102

          #5
          Originally posted by Justin7
          Compare your R2 vs the closing lines' R2. If you get pretty close to theirs, you have something good. If your R2 is smaller over a huge sample, you will destroy the sport.
          Thank you for the help. I am a little confused on this. I was under the impression that I want my R2 to be a bigger number, as in the variance in the total is better explained by my model. Would I want it to be smaller than the line vs. total? I did run a regression on the line vs. the total, ending up with an R2 of less than .001. I am going to re-run them in case I made a mistake, but that was the impression I was under.
          Comment
          • Waterstpub87
            SBR MVP
            • 09-09-09
            • 4102

            #6
            I recopied the data and ran a new regression. This time my model came up with an R2 of .1783 and the line-to-total regression came up with an R2 of .2192.
            Comment
            • MonkeyF0cker
              SBR Posting Legend
              • 06-12-07
              • 12144

              #7
              Originally posted by Waterstpub87
              Thank you for the help. I am a little confused on this. I was under the impression that I want my R2 to be a bigger number, as in the variance in the total is better explained by my model. Would I want it to be smaller than the line vs. total? I did run a regression on the line vs. the total, ending up with an R2 of less than .001. I am going to re-run them in case I made a mistake, but that was the impression I was under.
              Yes. You want a higher R2. Not lower.
              Comment
              • mathdotcom
                SBR Posting Legend
                • 03-24-08
                • 11689

                #8
                Originally posted by Justin7
                Compare your R2 vs the closing lines' R2. If you get pretty close to theirs, you have something good. If your R2 is smaller over a huge sample, you will destroy the sport.
                Comment
                • Waterstpub87
                  SBR MVP
                  • 09-09-09
                  • 4102

                  #9
                  Justin7, how close is pretty close? My model was .4 or so off of theirs. Do you think I might have something, albeit with a low sample size?
                  Comment
                  • Justin7
                    SBR Hall of Famer
                    • 07-31-06
                    • 8577

                    #10
                    Let's make sure we are talking about the same thing, first.

                    One measure of how good a model is: look at every game. Delta = spread result minus spread prediction (or total result minus total prediction). If, for all games, you take the sum of delta^2, determine the average, and take the square root, that is what I was calling R2 error. I'm pretty sure everyone else is using R2 to describe something else.

                    If your error (as I described) is, on average, equal to or less than that of the market spreads, you will win. If you get very close (i.e. 11 vs 10.5), you will probably have subsets of your model that do very well.
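                    If it helps, a rough sketch of that calculation (placeholder numbers; plug in your own predictions, the market totals, and the final scores):

                    import numpy as np

                    # Placeholder totals; swap in your own predictions, market numbers, and results.
                    model_pred   = np.array([44.5, 38.0, 51.0, 47.5, 41.0])
                    market_total = np.array([43.0, 40.5, 49.0, 46.0, 42.5])
                    result       = np.array([30.0, 45.0, 60.0, 41.0, 37.0])

                    def error(prediction, outcome):
                        # Sum of squared deltas, averaged, then square-rooted.
                        delta = outcome - prediction
                        return np.sqrt(np.mean(delta ** 2))

                    print("model error :", error(model_pred, result))
                    print("market error:", error(market_total, result))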
                    Comment
                    • Waterstpub87
                      SBR MVP
                      • 09-09-09
                      • 4102

                      #11
                      Originally posted by Justin7
                      Let's make sure we are talking about the same thing, first. One measure of how good a model is: look at every game. Delta = spread result minus spread prediction (or total result minus total prediction). If, for all games, you take the sum of delta^2, determine the average, and take the square root, that is what I was calling R2 error. I'm pretty sure everyone else is using R2 to describe something else. If your error (as I described) is, on average, equal to or less than that of the market spreads, you will win. If you get very close (i.e. 11 vs 10.5), you will probably have subsets of your model that do very well.
                      Justin7,
                      I took the absolute value of the difference between predicted vs. actual, and line vs. actual. I squared all the results, averaged them, then took the square root. I ended up with 12.28 for the model and 11.88 for the line, so I'm pretty close. It works out toward what you said in the second part, because the 67.5% figure comes from the bets I made when my prediction was 5 points past the line either way. So am I possibly on to something here?
                      Comment
                      • Justin7
                        SBR Hall of Famer
                        • 07-31-06
                        • 8577

                        #12
                        Originally posted by Waterstpub87
                        Justin7,
                        I took the absolute value of the difference between predicted vs. actual, and line vs. actual. I squared all the results, averaged them, then took the square root. I ended up with 12.28 for the model and 11.88 for the line, so I'm pretty close. It works out toward what you said in the second part, because the 67.5% figure comes from the bets I made when my prediction was 5 points past the line either way. So am I possibly on to something here?
                        That is close enough that there might be some good subsets.

                        The next thing I would do: backtest your projections for the last 10 years vs. both openers and closers. Try 3 different cutoffs: 3, 5, and 7 points. What results do you get vs. openers? And closers?
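                        A rough sketch of that backtest, assuming one row per game with columns for your projection, the opener, the closer, and the final total (the file and column names below are placeholders):

                        import pandas as pd

                        # Hypothetical file and columns: model_total, opener, closer, final_total.
                        games = pd.read_csv("nfl_totals_history.csv")

                        def grade(row, line):
                            # 1 = win, 0 = loss, None = push.
                            if row["final_total"] == line:
                                return None
                            picked_over = row["model_total"] > line
                            landed_over = row["final_total"] > line
                            return 1 if picked_over == landed_over else 0

                        for line_col in ["opener", "closer"]:
                            for cutoff in [3, 5, 7]:
                                bets = games[(games["model_total"] - games[line_col]).abs() >= cutoff]
                                results = bets.apply(lambda r: grade(r, r[line_col]), axis=1).dropna()
                                if len(results):
                                    print(line_col, cutoff, f"{results.astype(float).mean():.1%} over {len(results)} bets")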
                        Comment
                        • mathdotcom
                          SBR Posting Legend
                          • 03-24-08
                          • 11689

                          #13
                          You guys are talking about mean squared error, not R-squared...
                          Comment
                          • AlwaysDrawing
                            SBR Wise Guy
                            • 11-20-09
                            • 657

                            #14
                            Originally posted by Waterstpub87
                            Justin7,
                            I took the absolute value of the difference between predicted vs. actual, and line vs. actual. I squared all the results, averaged them, then took the square root. I ended up with 12.28 for the model and 11.88 for the line, so I'm pretty close. It works out toward what you said in the second part, because the 67.5% figure comes from the bets I made when my prediction was 5 points past the line either way. So am I possibly on to something here?
                            If you're squaring it, you don't need to take the absolute value.

                            Also, mdc is right, j7 is talking about MSE (though MSE is inversely related to R^2).

                            So, you want a lower MSE than closers, but a higher R^2.
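                            To see the connection concretely: for a simple regression of the actual totals on the predictions, R^2 equals 1 minus (the mean squared error around the fitted line divided by the variance of the actual totals), so a smaller error means a bigger R^2. A toy check with arbitrary numbers:

                            import numpy as np
                            from scipy.stats import linregress

                            # Arbitrary placeholder data.
                            predicted = np.array([44.5, 38.0, 51.0, 47.5, 41.0, 52.5])
                            actual    = np.array([30.0, 45.0, 60.0, 41.0, 37.0, 55.0])

                            fit = linregress(predicted, actual)
                            fitted = fit.intercept + fit.slope * predicted

                            mse_around_fit = np.mean((actual - fitted) ** 2)         # error left after the fit
                            var_actual     = np.mean((actual - actual.mean()) ** 2)  # spread of the actual totals

                            print("R2 from the regression:", fit.rvalue ** 2)
                            print("1 - MSE / variance    :", 1 - mse_around_fit / var_actual)  # same value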
                            Comment
                            • AlwaysDrawing
                              SBR Wise Guy
                              • 11-20-09
                              • 657

                              #15
                              That said, you should be using root mean squared error, because that is intuitively what people think of (the absolute difference between actual and predicted), not the square of the difference.


                              EDIT: OP may have been doing that. j7 is being unclear.
                              Last edited by AlwaysDrawing; 12-07-11, 02:47 PM. Reason: OP may have been using root MSE
                              Comment
                              • AlwaysDrawing
                                SBR Wise Guy
                                • 11-20-09
                                • 657

                                #16
                                Originally posted by Waterstpub87
                                Justin7,
                                I took the absolute value of the difference between predicted vs. actual, and line vs. actual. I squared all the results, averaged them, then took the square root. I ended up with 12.28 for the model and 11.88 for the line, so I'm pretty close. It works out toward what you said in the second part, because the 67.5% figure comes from the bets I made when my prediction was 5 points past the line either way. So am I possibly on to something here?
                                OP:

                                Remember, the width of the confidence interval is proportional to the RMSE, but with a difference so small (just a couple of percent), I would be wary of adding additional parameters to your model to try to reduce the RMSE further, unless you have something you feel strongly might influence your model from an intuitive standpoint.

                                Backtest it in tranches where your line differs, like j7 says, and see whether your model or the market was better. Remember not to backtest it on data you used to create the model.
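                                One simple way to keep that honest is a season split: fit on the early years and grade only the later ones. A minimal sketch (file and column names are placeholders again):

                                import pandas as pd

                                # Placeholder file with a season column alongside the totals data.
                                games = pd.read_csv("nfl_totals_history.csv")

                                fit_years  = games[games["season"] <= 2008]   # build and tune the model on these years only
                                test_years = games[games["season"] >= 2009]   # grade it out of sample on these

                                print(len(fit_years), "games for fitting,", len(test_years), "games for testing")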
                                Comment
                                • Waterstpub87
                                  SBR MVP
                                  • 09-09-09
                                  • 4102

                                  #17
                                  Originally posted by AlwaysDrawing
                                  If you're squaring it, you don't need to take the absolute value. Also, mdc is right, j7 is talking about MSE (though MSE is inversely related to R^2). So, you want a lower MSE than closers, but a higher R^2.
                                  AD,
                                  You're right on that; I just wanted the column there. I reran the regression with only the games past 5 points, which is what I've been betting. I ended up with an R2 of .1994 for the model and an R2 of .1496 for the line. This ties into the fact that it has been doing great past 5 points. I'm going to backtest it over the 10 years when I get a chance, but is this promising thus far?
                                  Comment
                                  • AlwaysDrawing
                                    SBR Wise Guy
                                    • 11-20-09
                                    • 657

                                    #18
                                    Originally posted by Waterstpub87
                                    AD,
                                    You're right on that; I just wanted the column there. I reran the regression with only the games past 5 points, which is what I've been betting. I ended up with an R2 of .1994 for the model and an R2 of .1496 for the line. This ties into the fact that it has been doing great past 5 points. I'm going to backtest it over the 10 years when I get a chance, but is this promising thus far?
                                    I would say so. Sounds like a solid model. Backtesting it is of course one way to make sure, though forward testing (i.e., betting it yourself) works too. Higher risk, but you'll definitely figure out if it's profitable.

                                    If you decide you want someone to take a closer look, let me know.
                                    Comment