Small world, Rich. I'm from Ontario as well. I follow edges at PLP and play Proline daily. I've never gotten into the Props because my impression is that any good edges are LLE pretty early in the morning. Correct me if I'm wrong; if I am, I may look into props more closely.
You're not intruding at all, and have actually brought up a good point on the defencemen. I hadn't considered that, but you're right, teams that have more offense from their backend may have overinflated Corsi/Fenwick. Not sure exactly what to do about that, but I'll give it some thought.
I won't get into every detail of my model, as it will be a long post just giving the basics.
The first part is calculating an expected number of goals for each team. There are two factors which contribute to the Expected Goals. I consider goals at even strength and goals on the power play. I have assumed that short-handed goals are immaterial enough that they can be ignored. Not sure if this is a valid assumption, but it's what I've done thus far.
Even strength goals are driven by Fenwick. In this post I won't get into exactly how I calculate this portion, but it's basically a combination of Fenwick Close and Fenwick Home/Road, depending on which team is at home and which is on the road. For the Colorado-Winnipeg game tonight I came up with 50.1 for Colorado and 47.6 for Winnipeg. Next I divide the expected number of shots in the game (56) between the two teams based on their possession stats. So Colorado ends up with (50.1/(50.1+47.6))*56 = 28.72 shots. I know it seems a bit backwards to use Fenwick to come up with the number of shots. As you suggested, why not just use the number of shots? As I said above, my thinking was that Fenwick better quantified the factors I wanted to work into my model. However, now I'm questioning even that myself.
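For anyone who wants to check the shot split, here is a minimal Python sketch of that one step. The function name split_shots is just an illustrative label, and the inputs are the numbers quoted above; how the Fenwick ratings themselves are built is not covered here.

```python
def split_shots(fenwick_a, fenwick_b, total_shots):
    """Divide the expected total shots between two teams in proportion
    to their possession ratings (hypothetical helper, not a library call)."""
    share_a = fenwick_a / (fenwick_a + fenwick_b)
    shots_a = share_a * total_shots
    return shots_a, total_shots - shots_a


# Inputs quoted in the post: Colorado 50.1, Winnipeg 47.6, 56 shots expected.
col_shots, wpg_shots = split_shots(50.1, 47.6, 56)
print(round(col_shots, 2), round(wpg_shots, 2))  # 28.72 27.28
```

The two shares always add back up to the 56 total, so only one team's number needs to be calculated directly.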
Anyway, that can be debated later. PDO says that in the long run, a team's shooting % + save % should come very close to equalling 100%. Therefore, I have normalized each team's Shooting % and Save % so that they total 100%. For Colorado's Shooting % in this game, I take their Shooting % and combine it with Winnipeg's Save %. Since I've normalized each team's PDO to 100%, I really could just use a league average Shooting and Save %. However, in some cases it makes a cent or two of difference, so I've continued to do this for now. In this game, I have Colorado's shooting % at 8.3%. So their even strength expected goals is 28.72*.083 = 2.38.
Power play goals are based on the number of shots for and shots allowed per minute of power play/penalty kill time. I factor in that the home team will have more power play time due to the referees' bias toward calling more penalties on the visiting team. The stat I have seen is that on average the home team will have 2.5 more minutes on the power play than the road team. I use a shooting percentage of 12%. In this game, for Colorado I have 0.814 shots per minute of power play time and I have them on the power play for 7.5 minutes. So the expected number of power play goals is 0.814*7.5*.12 = 0.733.
So Colorado's expected goals tonight is 2.38 + 0.733 = 3.11
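Putting the even strength and power play pieces together, a rough Python sketch of the expected goals step might look like this. The function and parameter names are made up for illustration, the 12% power play shooting percentage is the figure quoted above, and the blending of Colorado's shooting % with Winnipeg's save % is assumed to have been done already (8.3% is taken as given).

```python
def expected_goals(es_shots, es_shooting_pct, pp_shots_per_min,
                   pp_minutes, pp_shooting_pct=0.12):
    """Expected goals = even strength goals + power play goals.
    Short-handed goals are ignored, as in the post."""
    es_goals = es_shots * es_shooting_pct
    pp_goals = pp_shots_per_min * pp_minutes * pp_shooting_pct
    return es_goals + pp_goals


# Colorado's inputs from the post.
col_xg = expected_goals(28.72, 0.083, 0.814, 7.5)

# The unrounded total lands a hair above the 3.11 quoted above, because
# the post rounds the two components (2.38 and 0.733) before adding them.
print(col_xg)
```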
Using similar logic I have Winnipeg's expected goals as 2.78.
I then use Poisson to figure out the probability of each team scoring each number of goals.
P(X=k) = (E(Goals)^k * e^(-E(Goals)))/(k!)
So, for example Colorado's probability of scoring exactly 2 is:
P(X=2) = (3.11^2 * 2.71828^(-3.11))/(2!) = 21.6%
I do that for both teams, for k = 0, 1, 2, 3, 4, 5, 6, 7. I stop at 7 since anything higher basically has a 0 probability.
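The Poisson step is easy to reproduce with just the standard library. This is a sketch using the expected goals quoted above (3.11 and 2.78); poisson_pmf is a hypothetical helper name.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


# Probabilities of scoring exactly 0..7 goals for each team.
col_pmf = [poisson_pmf(k, 3.11) for k in range(8)]
wpg_pmf = [poisson_pmf(k, 2.78) for k in range(8)]

print(round(col_pmf[2], 3))  # 0.216

# Probability mass left over beyond 7 goals for Colorado.
print(1 - sum(col_pmf))
```

The last line is a quick way to check how much probability is being dropped by stopping at 7.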
Then I calculate the cumulative distribution by summing the probabilities. So, for example, P(X<=2) = P(X=0) + P(X=1) + P(X=2).
This allows me to start calculating the probability of each team winning. I do this for each goal amount. So what is Colorado's probability of winning if Winnipeg scores 0 goals (disregarding a tie at this point)? It's the probability of Winnipeg scoring 0 goals multiplied by the probability of Colorado scoring 1 or more goals. Or, in other words:
P(Winnipeg X=0) * (1 - P(Colorado X<=0))
What if Winnipeg scores 1 goal?
P(Winnipeg X=1) * (1 - P(Colorado X<=1))
2 goals?
P(Winnipeg X=2) * (1 - P(Colorado X<=2))
and so on.
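The whole win/loss/tie calculation can be sketched in Python as below. It follows the formula above term by term, with the same cutoff at 7 goals; the function names are made up for illustration, and 3.11 and 2.78 are the expected goals quoted earlier.

```python
import math
from itertools import accumulate


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def outcome_probs(lam_a, lam_b, max_goals=7):
    """Return (P(A wins), P(B wins), P(tie)) in regulation, treating the
    two teams' goal counts as independent Poisson variables."""
    ks = range(max_goals + 1)
    pmf_a = [poisson_pmf(k, lam_a) for k in ks]
    pmf_b = [poisson_pmf(k, lam_b) for k in ks]
    cdf_a = list(accumulate(pmf_a))  # cdf_a[k] = P(A <= k)
    cdf_b = list(accumulate(pmf_b))

    # A wins when B scores exactly k and A scores more than k.
    a_wins = sum(pmf_b[k] * (1 - cdf_a[k]) for k in ks)
    b_wins = sum(pmf_a[k] * (1 - cdf_b[k]) for k in ks)
    # Tie when both score exactly k.
    tie = sum(pmf_a[k] * pmf_b[k] for k in ks)
    return a_wins, b_wins, tie


col_win, wpg_win, tie = outcome_probs(3.11, 2.78)
print(round(col_win, 2), round(wpg_win, 2), round(tie, 2))
```

Because of the cutoff at 7 goals, the three numbers add up to very slightly less than 1.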
I then sum up all those probabilities for each team. At this point my sum comes to Colorado at 47% and Winnipeg at 36%. The remaining 17% comes from situations where they are expected to score the same number of goals. For simplicity I divide this amount by 2 and assign half to each team. So my final probabilities would be 55.5% and 44.5%. I'm starting to think that assumption is too simplistic though. I'm basically assuming that if it goes to a shootout both teams have a 50% chance. I think that assumption may not be valid. Anyway, that's what I'm doing at this point.
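Here is a small sketch of the tie split in Python. The p_a_wins_tie parameter is a hypothetical knob, not something in the current model: leaving it at 0.5 reproduces the even split, and changing it is one way to test how much the 50/50 assumption matters. Dividing by the total is also an added assumption, so the two results always sum to 1 even when the inputs are rounded.

```python
def split_ties(win_a, win_b, tie, p_a_wins_tie=0.5):
    """Hand out the tie probability between the two teams.
    p_a_wins_tie is the assumed chance that team A wins a game that is
    tied after regulation (0.5 = the even split used in the post)."""
    total = win_a + win_b + tie
    final_a = (win_a + tie * p_a_wins_tie) / total
    final_b = (win_b + tie * (1 - p_a_wins_tie)) / total
    return final_a, final_b


col_final, wpg_final = split_ties(0.47, 0.36, 0.17)
print(round(col_final, 3), round(wpg_final, 3))  # 0.555 0.445
```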
I then translate those percentages into a betting line. So Colorado's line should be -125. If there is a big enough difference between that line and the available line I make the bet (assuming no injuries or other information to consider). In this case, I bet Winnipeg +152.
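For the conversion between win probability and a moneyline, a minimal Python sketch is below. These are the standard American odds formulas with no vig built in; the function names are illustrative only.

```python
def prob_to_american(p):
    """Fair American moneyline for a win probability p (0 < p < 1)."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if p >= 0.5:
        return -100 * p / (1 - p)   # favourite: negative line
    return 100 * (1 - p) / p        # underdog: positive line


def american_to_prob(line):
    """Implied win probability of an American moneyline."""
    if line < 0:
        return -line / (-line + 100)
    return 100 / (line + 100)


print(round(prob_to_american(0.555)))    # -125
print(round(prob_to_american(0.445)))    # 125
print(round(american_to_prob(152), 3))   # 0.397
```

So the +152 on offer implies roughly a 39.7% chance for Winnipeg, against the model's 44.5%, and that gap is the value being bet.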
I know my model isn't perfect and it has some oversimplifications that I will need to work out at some point. However, I do think it has aspects which allow it to find value. I just have yet to discover if that's actually true, since I have never run it over the course of one season, let alone multiple seasons.