1. #1
    bztips
    Join Date: 06-03-10
    Posts: 283

    logit modeling

    Suppose we are trying to create a model to predict winners of baseball games; we select a binary logit specification. The probability of a team winning is specified as a function of, say, the published odds (most important!), plus any other variables that we think may help to explain the outcomes.

    Of course there are a variety of measures that could be employed to assess goodness-of-fit for such a model. However, the main thing we're interested in is not how well it fits per se, but whether the probability estimates generated from the model provide a measurable edge against the available odds. If so, we bet. And of course we do out-of-sample testing to see whether the edges hold up.
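    A rough sketch of this setup in Python (the data, the `rest_days` feature, and every column name here are made-up stand-ins, just to make the idea concrete):

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm

        # Synthetic stand-in for real games: implied probabilities from the
        # published odds, one illustrative extra feature, and outcomes.
        rng = np.random.default_rng(1)
        n = 500
        games = pd.DataFrame({
            "imp_prob": rng.uniform(0.35, 0.65, n),   # implied prob from the odds
            "rest_days": rng.integers(0, 4, n),       # illustrative extra variable
        })
        games["win"] = rng.binomial(1, games["imp_prob"])  # outcomes track the odds

        # Binary logit: P(win) as a function of the odds plus other variables.
        X = sm.add_constant(games[["imp_prob", "rest_days"]])
        model = sm.Logit(games["win"], X).fit(disp=False)

        games["p_hat"] = model.predict(X)                   # model win probability
        games["edge"] = games["p_hat"] - games["imp_prob"]  # bet when this clears a threshold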

    A problem arises, however, if the model really doesn't fit very well. A poor model without much explanatory power will tend to generate probability estimates near 50% for each of the two alternatives; in fact, 50/50 is the exact prediction if there is no explanatory value at all. If we insist on using such a model anyway, it will on average show a supposed "edge" primarily on underdogs: their predicted probabilities get pulled up toward 50%, above the market's implied probability. Favorites' probabilities get pulled down toward 50% for the same reason, below the market's implied probability, so the model will rarely show an edge on them.

    In practical terms, I use this to screen my logit models (instead of trying to directly assess the various goodness-of-fit measures): if my model is projecting large edges primarily on underdogs at the expense of favorites, then I know it probably is not a very good model.
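    To make the screen concrete, continuing the sketch above (the 3% and 90% thresholds are arbitrary choices, not recommendations):

        # What share of the model's large "edges" fall on underdogs?
        big_edges = games[games["edge"] > 0.03]
        dog_share = (big_edges["imp_prob"] < 0.5).mean()
        if dog_share > 0.9:
            print("Red flag: edges are almost all on underdogs -- the model "
                  "may just be shrinking every probability toward 50%.")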

    Thoughts?

  2. #2
    Miz
    Join Date: 08-30-09
    Posts: 695
    Betpoints: 3162

    Are you intending to use the line as an input parameter of your model? Maybe I misunderstood, but I wouldn't think you'd want to do that.

    Aside from that...

    Have you explored evaluating your model on a filtered sample, i.e., testing it against games with lines that are closer to 50/50? Say in MLB you limited your evaluation (test) set to games priced lower than -130/130 vig-free. Some models have trouble with large favorites (likely to be non-competitive contests) but can distinguish edges more accurately in more competitive contests. So, in short, does it still choose all underdogs in what are likely to be more competitive games? If not, then maybe it isn't time to toss it out the window yet.

    The line is just an indicator that a contest is likely to be competitive, not a guarantee that it will turn out that way. Either way, it can be used as a filter for a test set.
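    For example (a rough sketch only; in practice you'd use vig-free probabilities, and the synthetic `imp_prob` from the earlier sketch just stands in):

        # -130 American odds imply 130/230 ~ 0.565, so "closer than
        # -130/130" is roughly the band [0.435, 0.565].
        competitive = games[games["imp_prob"].between(0.435, 0.565)]
        # ...then repeat the out-of-sample evaluation on `competitive` only.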
    Last edited by Miz; 01-06-13 at 09:29 PM.

  3. #3
    bztips
    Join Date: 06-03-10
    Posts: 283

    Thanks Miz. That is one area that I haven't looked at closely enough -- subsets of games with particular characteristics. Of course doing too much of that can lead to misleading results due to data mining.

  4. #4
    Miz
    Join Date: 08-30-09
    Posts: 695
    Betpoints: 3162

    Yep, I totally understand. My thinking on data mining is that if I have an intuitive reason that can very simply explain why a subset performed better, then I consider that acceptable. If I have to use more than one reason to explain it away, then I call BS on myself. In your case, if you saw an improvement, I think it could be justified. People may disagree, but I do exactly this type of filtering on two models that I bet: I only consider what are likely to be competitive games. Game dynamics can change when one side is likely to have a big lead.

  5. #5
    jspice
    Join Date: 02-07-13
    Posts: 2
    Betpoints: 66

    Maybe I'm missing something, but why would you ignore goodness-of-fit measures for your model? Normally you would fit a model with just the published odds as the lone explanatory variable, then sequentially add your new explanatory variables and compare -2LogL, AIC, BIC, etc. for each run of the model. I also assume you are looking at the significance of the parameter estimates when you are fitting the model to the training sample? These diagnostics taken together tell us whether the extra variables are adding any significant explanatory power to the model. I suppose you could ignore them and just see how well it performs on out-of-sample tests, but you could do the same thing with a randomly-generated noise variable.
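    A sketch of that sequential comparison, reusing the synthetic `games` frame from the first post (the numbers mean nothing; it just shows the mechanics):

        from scipy import stats
        import statsmodels.api as sm

        def fit_logit(cols):
            X = sm.add_constant(games[cols])
            return sm.Logit(games["win"], X).fit(disp=False)

        base = fit_logit(["imp_prob"])                # odds-only baseline
        full = fit_logit(["imp_prob", "rest_days"])   # baseline + one candidate

        print(base.aic, full.aic)                     # lower is better
        print(base.bic, full.bic)
        lr_stat = 2 * (full.llf - base.llf)           # likelihood-ratio test
        print(stats.chi2.sf(lr_stat, df=1))           # p-value, 1 added parameter
        print(full.pvalues)                           # significance of each estimate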

  6. #6
    bztips
    Join Date: 06-03-10
    Posts: 283

    Welcome to the forum jspice. Sorry if I gave the wrong impression. I agree with absolutely everything you said; so yes, I'm looking at the various information criteria, and of course the signs and significance of the parameter estimates. But I wanted to focus the discussion on the 50% phenomenon, which I think lots of people may overlook -- i.e., a crappy model will tend to produce estimates near 50%.

  7. #7
    buby74
    Join Date: 06-08-10
    Posts: 92
    Betpoints: 21207

    A fascinating question.

    When you say your model tends towards 50/50, does it identify the favourite correctly, or is it truly random?

    I am having a possibly similar problem with my models, which tend to identify the favourite quite reliably but almost always have them as a smaller favourite than the Vegas line. So it identifies an edge on the underdog (although this has worked very well in the NBA this year). I think it could be that, because I weight all games during a season equally, I am missing the fluctuation in true team skill during the season. A team that goes 0-5 and then 10-0 is treated the same as a team that goes 10-0 and then 0-5. Luck can explain some of this variation, but by ignoring true changes in team skill during the season my model is too "homogenous". I have experimented with weighting earlier games less and trying to find which weighting gives average lines similar to Vegas, but there are many combinations that fit, depending on which formula I use to reduce weighting over time, when I start reducing the impact of games (2 weeks? 3 weeks?), and which games I drop entirely (10 weeks? 12 weeks?). All of this will vary by sport as well.
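    One simple way to parameterize that weighting (the names and default values are illustrative, not recommendations):

        import numpy as np

        def game_weights(weeks_ago, half_life=3.0, cutoff=12):
            # Exponential decay: a game `half_life` weeks old counts half as
            # much; anything older than `cutoff` weeks is dropped entirely.
            weeks_ago = np.asarray(weeks_ago, dtype=float)
            w = 0.5 ** (weeks_ago / half_life)
            w[weeks_ago > cutoff] = 0.0
            return w

        print(game_weights([0, 3, 6, 13]))   # -> [1.0, 0.5, 0.25, 0.0]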

  8. #8
    jspice
    Join Date: 02-07-13
    Posts: 2
    Betpoints: 66

    bztips, I think I see what you are saying wrt tending toward 50/50: a logistic model of Y on X will give fitted probabilities P(Yi = 1) ≈ 0.5 for every observation if corr(X, Y) = 0. So if your model is bad, it will draw underdogs and favorites back toward the middle, around 50% probability, and that gives us a one-directional problem of not knowing the culprit (bad model, or favorites priced with too much edge). I am not sure enough about this to make an informed comment, so perhaps I will do some simulation and report back.
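    A minimal version of that simulation (pure noise in, so the fitted probabilities should all land near 0.5):

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(0)
        n = 5000
        x = rng.normal(size=n)                  # feature unrelated to the outcome
        y = rng.integers(0, 2, size=n)          # coin-flip results
        fit = sm.Logit(y, sm.add_constant(x)).fit(disp=False)
        p = fit.predict(sm.add_constant(x))
        print(p.min(), p.max())                 # clusters tightly around 0.5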

    Anyhow, whether or not your model is poor can really only be assessed with the available diagnostics and performance in the validation sample. You may consider looking at some pseudo R^2, although it cannot necessarily be interpreted as the % of total variance explained by the model the way R^2 can in OLS regression. In addition to checking measures of fit and significance of parameters, I would also look at:

    *Classification rates on your validation sample: % correct, % false positives, % false negatives. If you calibrate the model on the training data and then it performs well in the validation sample, then I do not really see what the problem is if you would have been +EV in backtesting against the vig. You may gain some insight on specificity/sensitivity though by looking at false pos and false neg rates which could help you tweak the model.

    *Collinearity diagnostics, since you are using published odds in the model. The problem is that published odds are a black box, so you don't really know what they are or are not including. It could be that some of your "unique" variables are already accounted for and highly collinear with the published odds. (Rough sketches of both checks follow below.)
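    Again on the synthetic frame from earlier posts (the random split here is only a stand-in for a proper holdout sample):

        from sklearn.metrics import confusion_matrix
        import statsmodels.api as sm
        from statsmodels.stats.outliers_influence import variance_inflation_factor

        # Classification rates on a held-out sample.
        val = games.sample(frac=0.3, random_state=0)   # stand-in validation split
        preds = (val["p_hat"] > 0.5).astype(int)
        tn, fp, fn, tp = confusion_matrix(val["win"], preds).ravel()
        print("pct correct:", (tp + tn) / len(val))
        print("false positive rate:", fp / (fp + tn))
        print("false negative rate:", fn / (fn + tp))

        # Collinearity with the published odds, via variance inflation
        # factors (ignore the const row).
        X = sm.add_constant(games[["imp_prob", "rest_days"]])
        for i, name in enumerate(X.columns):
            print(name, variance_inflation_factor(X.values, i))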

  9. #9
    marcoforte
    Join Date: 08-10-08
    Posts: 140
    Betpoints: 396

    If your model is showing edges on the underdogs and they are winning at a 50/50 rate, then who cares how close the fit is? I use a variation of this to identify and play selected NFL underdogs on the moneyline. It proved profitable this season.

  10. #10
    Juret
    Join Date: 07-18-10
    Posts: 113
    Betpoints: 1239

    Explanatory modeling and predictive modeling are two separate things and should therefore be handled in separate ways.

  11. #11
    Inspirited
    Join Date: 06-26-10
    Posts: 1,783
    Betpoints: 17828

    I have read that even non-significant variables can still be useful for prediction purposes. Is this correct?

  12. #12
    brettd
    Join Date: 01-25-10
    Posts: 229
    Betpoints: 3869

    Quote Originally Posted by buby74 View Post
    A fascinating question.

    When you say your model tends towards 50/50, does it identify the favourite correctly, or is it truly random?

    I am having a possibly similar problem with my models, which tend to identify the favourite quite reliably but almost always have them as a smaller favourite than the Vegas line. So it identifies an edge on the underdog (although this has worked very well in the NBA this year). I think it could be that, because I weight all games during a season equally, I am missing the fluctuation in true team skill during the season. A team that goes 0-5 and then 10-0 is treated the same as a team that goes 10-0 and then 0-5. Luck can explain some of this variation, but by ignoring true changes in team skill during the season my model is too "homogenous". I have experimented with weighting earlier games less and trying to find which weighting gives average lines similar to Vegas, but there are many combinations that fit, depending on which formula I use to reduce weighting over time, when I start reducing the impact of games (2 weeks? 3 weeks?), and which games I drop entirely (10 weeks? 12 weeks?). All of this will vary by sport as well.
    Structure your model for easy optimization. If, for example, you're using exponential smoothing as a weighting method, allow the parameters to be optimized against MSE (or some other metric).
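    For instance (a sketch only: `rebuild_and_predict` is a hypothetical helper whose internals depend entirely on your model, and `margin` is an illustrative column name):

        from scipy.optimize import minimize_scalar

        def backtest_mse(half_life):
            # Rebuild ratings with this decay, predict held-out games, score.
            preds = rebuild_and_predict(games, half_life)   # hypothetical helper
            return ((preds - games["margin"]) ** 2).mean()

        best = minimize_scalar(backtest_mse, bounds=(0.5, 10.0), method="bounded")
        print(best.x)   # the MSE-minimizing half-life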

  13. #13
    mathdotcom
    Join Date: 03-24-08
    Posts: 11,689
    Betpoints: 1943

    R-squared or pseudo R-squared are basically useless in these applications... ignore.

    Take NBA 2nd halves as an example. Regress the 2nd half total on the posted total for the entire game and you're going to find an R-squared around only 10% while the coefficient is highly significant. Throw in a few more features and you have a decent model -- but the R-squared still won't be much higher.

    One of my most successful derivatives models has an R-squared of 5%. This is just a feature of derivatives models. If you instead model based on the fundamentals, you'll have a much higher R-squared, since you're regressing on things that directly matter instead of things that are noisily correlated with those fundamentals (the line) -- that said, you're often still better off just using the line.
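    To illustrate the point with made-up numbers (synthetic data tuned to give a low R-squared; nothing here is real):

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(2)
        posted_total = rng.uniform(195, 235, 1000)              # full-game totals
        second_half = 0.5 * posted_total + rng.normal(0, 17, 1000)

        ols = sm.OLS(second_half, sm.add_constant(posted_total)).fit()
        print(ols.rsquared)    # small, around 0.10 here
        print(ols.pvalues)     # yet the slope is highly significant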