2012 vs. 2013 Miami Dolphins: A Statistical Comparison

Shouright · Dec 30, 2013

The following analysis is based on this page, outlined as follows:

CORRELATION SUMMARY

So far we’ve analyzed each phase of the game and its statistical connection with regular season team wins. Below is a table that lists the relevant statistics and their correlations. The table is sorted in order of absolute strength of correlation.

Stat	Win Correlation
Off Pass Yds/Att	0.61
Def Pass Yds/Att	-0.47
Off Fumble Rate	-0.46
Off Int Rate	-0.45
Def FFumble Rate	-0.41
Def Int Rate	0.39
Off Pen Rate	-0.37
Off Run Yds/Att	0.18
Def Run Yds/Att	-0.04

The table is presented graphically below. Negative coefficients, such as defense pass efficiency, are shown as positive values to make it easier to compare each variable's relative importance.

The relative importance of each aspect of the game begins to come into focus. Passing is most important, followed by turnovers, then penalties and running. For every aspect, the correlation on the offensive side of the ball is stronger than on the defensive side.

But this isn’t the final word. A simple correlation coefficient looks at one stat at a time, so it ignores the effect that every other stat has on wins.
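The ordering and the offense-versus-defense pattern described above can be checked directly from the correlation table. What follows is only a minimal Python sketch: the numbers are copied from the table, while the dictionary layout and the pairing of offensive and defensive stats are my own labels for illustration, not part of the original analysis.

```python
# Win correlations copied from the correlation table above.
correlations = {
    "Off Pass Yds/Att": 0.61,
    "Def Pass Yds/Att": -0.47,
    "Off Fumble Rate": -0.46,
    "Off Int Rate": -0.45,
    "Def FFumble Rate": -0.41,
    "Def Int Rate": 0.39,
    "Off Pen Rate": -0.37,
    "Off Run Yds/Att": 0.18,
    "Def Run Yds/Att": -0.04,
}

# Rank by absolute strength, ignoring the sign of the correlation.
ranked = sorted(correlations.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Offense/defense pairs (this pairing is an assumption for illustration;
# penalties have no defensive counterpart in the table).
pairs = {
    "passing": ("Off Pass Yds/Att", "Def Pass Yds/Att"),
    "fumbles": ("Off Fumble Rate", "Def FFumble Rate"),
    "interceptions": ("Off Int Rate", "Def Int Rate"),
    "running": ("Off Run Yds/Att", "Def Run Yds/Att"),
}
offense_stronger = {
    aspect: abs(correlations[off]) > abs(correlations[dfn])
    for aspect, (off, dfn) in pairs.items()
}

for stat, r in ranked:
    print(f"{stat:<18} {abs(r):.2f}")
print(offense_stronger)
```

Sorting on the absolute value is what lets a strongly negative stat, such as defensive pass efficiency, rank ahead of a weakly positive one.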

REGRESSION

To take all facets of the game into account simultaneously and produce a valid model of winning NFL games, we can use linear regression to estimate coefficients for each stat. The relative value of the coefficients will reveal the relative importance of each phase of the game, holding all other variables constant. This yields cleaner and more accurate estimates than simple correlations.

The dependent variable of the regression model is regular season wins. The independent variables are the efficiency stats I’ve previously outlined. The data set continues to be all 32 teams over the past 5 seasons for a total of 160 observations. The results of the regression are detailed below.

VARIABLE	COEFFICIENT
const	5.31
O Pass Eff	1.43
D Pass Eff	-1.65
O Int Rate	-53.50
D Int Rate	81.70
O Fum Rate	-49.10
D FF Rate	70.90
O Run Eff	1.00
D Run Eff	-0.55
Pen Rate	-2.73
R-squared	0.802

Each of the independent variables is statistically significant at the 0.05 level or better, except defensive run efficiency, which is significant at 0.06. The R-squared value indicates an extremely good overall fit for the model: about 80% of the variance in team wins can be explained by the included variables. The remaining 20% could be due to any number of factors, but we have to accept that outcomes in any sport are partly due to luck.

Using the regression results we can estimate a team’s expected wins using a linear equation. Here is what the equation would look like:

Wins = 5.31 + (1.43 * O Pass Eff) + (- 1.65 * D Pass Eff) + …

The regression coefficients are stated in terms of wins per unit of the variable. For example, the coefficient for offensive pass efficiency (yds/att) is 1.43. So for every 1 yard improvement in pass efficiency a team can expect 1.43 additional wins. When coefficients are stated this way, it makes it very easy to estimate the effect on the dependent variable (wins) given a change in one of the independent variables. But it makes it very difficult to get a sense of the relative importance of each variable. Defensive forced fumble rates are certainly not 70 times more important than offensive run efficiency.

To reveal the true relative importance of each factor, we need to standardize each variable by calculating the number of standard deviations from its average value. In statistics, these are known as “normalized” or “standardized” variables, noted by the prefix “z.”

Here are the regression results again, this time calculated with standardized coefficients. The significance of each variable and the overall fit of the model remain the same, since only the units of the variables have changed.

VARIABLE	COEFFICIENT
constant	8.06
Z O Pass Eff	1.14
Z D Pass Eff	-0.92
Z O Int Rate	-0.45
Z D Int Rate	0.76
Z O Fum Rate	-0.33
Z D FF Rate	0.42
Z O Run Eff	0.46
Z D Run Eff	-0.24
Z Pen Rate	-0.39
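The point that standardizing changes only the units, not the fit, can be illustrated with a small least-squares sketch. To be clear about the assumptions: the data below are randomly generated stand-ins, not the 2002 to 2006 team data, and the three made-up predictors and their generating numbers are purely hypothetical. The sketch uses NumPy's `numpy.linalg.lstsq` to fit both versions.

```python
import numpy as np

# Synthetic stand-in data (NOT the real team stats): 160 observations,
# three made-up predictors on very different scales.
rng = np.random.default_rng(0)
n_obs = 160
X = rng.normal(loc=[6.0, 0.03, 4.0], scale=[0.7, 0.01, 0.4], size=(n_obs, 3))
y = (8.0 + 1.4 * (X[:, 0] - 6.0) - 50.0 * (X[:, 1] - 0.03)
     + rng.normal(0.0, 1.0, n_obs))

def fit_ols(predictors, target):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    design = np.column_stack([np.ones(len(target)), predictors])
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ beta
    ss_res = resid @ resid
    ss_tot = (target - target.mean()) @ (target - target.mean())
    return beta, 1.0 - ss_res / ss_tot

# Standardize each predictor: subtract its mean, divide by its SD.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

beta_raw, r2_raw = fit_ols(X, y)
beta_std, r2_std = fit_ols(Z, y)

print("raw coefficients:         ", np.round(beta_raw, 2))
print("standardized coefficients:", np.round(beta_std, 2))
print("R^2 raw vs standardized:  ", round(r2_raw, 4), round(r2_std, 4))
```

In this sketch the R-squared is identical for both fits, each standardized slope equals the raw slope times that predictor's standard deviation, and the constant becomes the average of the dependent variable. That last point is consistent with the constant in the standardized table sitting close to 8 wins.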

It may seem like we’ve gone through a tortured process to arrive at these coefficients. But they are merely the mathematical weight we would need to give each factor to have the best estimate of actual team wins. These are based on real-world data from every team’s season between 2002 and 2006.

Here is a graph representing each variable’s relative weight. Negative coefficients are shown as positive values.

Probably the simplest way to interpret the chart is this: if my team is average in absolutely everything, I'd expect to win 8 games. But if my team is average in everything except offensive pass efficiency, in which we're one standard deviation above average, I'd expect to win 9.14 games (8 + 1*1.14).

So if my team were the league's best at running the ball, say 2.5 standard deviations above average, but average at everything else, we'd expect to win 9.15 games (8 + 2.5*0.46). Compare that to passing: if my team were average at everything but best in the league in passing, we'd expect to win 10.85 games (8 + 2.5*1.14).
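The arithmetic in these examples is easy to reproduce. Here is a minimal Python sketch that uses the standardized coefficients from the table above; the function name `expected_wins` and its `baseline` parameter are hypothetical names of my own. The baseline is set to the rounded 8 wins used in the examples rather than the table constant of 8.06.

```python
# Standardized coefficients copied from the table above.
STD_COEF = {
    "Z O Pass Eff": 1.14,
    "Z D Pass Eff": -0.92,
    "Z O Int Rate": -0.45,
    "Z D Int Rate": 0.76,
    "Z O Fum Rate": -0.33,
    "Z D FF Rate": 0.42,
    "Z O Run Eff": 0.46,
    "Z D Run Eff": -0.24,
    "Z Pen Rate": -0.39,
}

def expected_wins(z_scores, baseline=8.0):
    """Baseline wins plus coefficient * z-score for each stat supplied.

    Any stat left out of z_scores is treated as league average (z = 0).
    """
    return baseline + sum(STD_COEF[name] * z for name, z in z_scores.items())

one_sd_passing = expected_wins({"Z O Pass Eff": 1.0})
best_rushing = expected_wins({"Z O Run Eff": 2.5})
best_passing = expected_wins({"Z O Pass Eff": 2.5})

print(f"{one_sd_passing:.2f} {best_rushing:.2f} {best_passing:.2f}")
```

Passing `baseline=8.06` instead would shift every estimate up by 0.06 wins without changing the gap between the running and passing scenarios.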

Using the model above, we get the following data for the 2012 Miami Dolphins:

	OFF NET YPA	DEF NET YPA	OFF INT RATE	DEF INT RATE	OFF FUM RATE	DEF FF RATE	OFF YPC	DEF YPC	PENALTY RATE
2012 SEASON	5.8817	6.19	2.5794	1.6667	3.2468	0.014	4.0955	4.0232	0.3598

LEAGUE AVG	6.27	6.27	2.7	2.7	2.506729	0.0147	4.15	4.16	0.416
LEAGUE SD	0.72	0.65	0.94	0.81	0.629049	0.004144	0.46	0.4	0.0759
DOLPHINS Z	-0.54	-0.12	-0.13	-1.28	1.18	-0.17	-0.12	-0.34	-0.74
COEFFICIENT	1.14	-0.92	-0.45	0.76	-0.33	0.42	0.46	-0.24	-0.39
WEIGHTED # WINS	-0.61	0.11	0.06	-0.97	-0.39	-0.07	-0.05	0.08	0.29
ESTIMATED WINS	6.50

We get the following data for the 2013 Dolphins:

	OFF NET YPA	DEF NET YPA	OFF INT RATE	DEF INT RATE	OFF FUM RATE	DEF FF RATE	OFF YPC	DEF YPC	PENALTY RATE
2013 SEASON	5.4601	6.0419	3.1987	3.1034	2.0942	0.0081	4.1261	4.1281	0.2838

LEAGUE AVG	6.27	6.27	2.7	2.7	2.506729	0.0147	4.15	4.16	0.416
LEAGUE SD	0.72	0.65	0.94	0.81	0.629049	0.004144	0.46	0.4	0.0759
DOLPHINS Z	-1.12	-0.35	0.53	0.50	-0.66	-1.59	-0.05	-0.08	-1.74
COEFFICIENT	1.14	-0.92	-0.45	0.76	-0.33	0.42	0.46	-0.24	-0.39
WEIGHTED # WINS	-1.28	0.32	-0.24	0.38	0.22	-0.67	-0.02	0.02	0.68
ESTIMATED WINS	7.46
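For anyone who wants to re-run the two tables, here is a minimal Python sketch of the same steps: z-score each stat against the league average and SD, multiply by the standardized coefficient, and add the results to the 8.06 constant. All numbers are copied from the tables above; the function name `estimate_wins` is a hypothetical name of my own. Because the sketch carries unrounded z-scores, its totals can differ from the tables in the second decimal place.

```python
FACTORS = ["OFF NET YPA", "DEF NET YPA", "OFF INT RATE", "DEF INT RATE",
           "OFF FUM RATE", "DEF FF RATE", "OFF YPC", "DEF YPC", "PENALTY RATE"]

# League baselines and standardized coefficients, copied from the tables above.
LEAGUE_AVG = [6.27, 6.27, 2.7, 2.7, 2.506729, 0.0147, 4.15, 4.16, 0.416]
LEAGUE_SD = [0.72, 0.65, 0.94, 0.81, 0.629049, 0.004144, 0.46, 0.4, 0.0759]
COEFFICIENT = [1.14, -0.92, -0.45, 0.76, -0.33, 0.42, 0.46, -0.24, -0.39]
CONSTANT = 8.06

SEASONS = {
    2012: [5.8817, 6.19, 2.5794, 1.6667, 3.2468, 0.014, 4.0955, 4.0232, 0.3598],
    2013: [5.4601, 6.0419, 3.1987, 3.1034, 2.0942, 0.0081, 4.1261, 4.1281, 0.2838],
}

def estimate_wins(values):
    """Return (estimated wins, z-scores, weighted wins per factor)."""
    z = [(v - avg) / sd for v, avg, sd in zip(values, LEAGUE_AVG, LEAGUE_SD)]
    weighted = [coef * zi for coef, zi in zip(COEFFICIENT, z)]
    return CONSTANT + sum(weighted), z, weighted

results = {year: estimate_wins(values) for year, values in SEASONS.items()}

for year, (wins, z, weighted) in results.items():
    print(f"{year}: estimated wins = {wins:.2f}")
    for name, zi, wi in zip(FACTORS, z, weighted):
        print(f"  {name:<13} z = {zi:+.2f}  weighted wins = {wi:+.2f}")
```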

So we see an improvement in estimated wins of almost exactly one game (6.50 to 7.46), which matches the one-game difference in the team's record between this year and last. We could also say both teams "overachieved" a bit: each finished about half a game above its estimate (7 and 8 wins) rather than about half a game below it (6 and 7 wins). This is likely attributable to random and/or intangible factors that may or may not be measurable quantitatively.

What's also clear is that the one-game improvement in 2013 was attributable to three things: better pass defense (lower defensive net YPA and a higher defensive interception rate), greater discipline (lower penalty yards per play), and better ball protection on offense (lower fumble rate).
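One way to see where the change came from is to subtract the 2012 "WEIGHTED # WINS" row from the 2013 row, factor by factor. The Python sketch below does only that, using the rounded values printed in the two tables above. The variable names are my own, and the per-factor differences inherit the rounding of the tables.

```python
FACTORS = ["OFF NET YPA", "DEF NET YPA", "OFF INT RATE", "DEF INT RATE",
           "OFF FUM RATE", "DEF FF RATE", "OFF YPC", "DEF YPC", "PENALTY RATE"]

# "WEIGHTED # WINS" rows copied from the 2012 and 2013 tables above.
WEIGHTED_2012 = [-0.61, 0.11, 0.06, -0.97, -0.39, -0.07, -0.05, 0.08, 0.29]
WEIGHTED_2013 = [-1.28, 0.32, -0.24, 0.38, 0.22, -0.67, -0.02, 0.02, 0.68]

# Change in weighted wins from 2012 to 2013 for each factor.
delta = {
    name: round(w13 - w12, 2)
    for name, w12, w13 in zip(FACTORS, WEIGHTED_2012, WEIGHTED_2013)
}
total_change = round(sum(delta.values()), 2)

# Print from biggest gain to biggest loss.
for name, change in sorted(delta.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<13} {change:+.2f}")
print(f"{'TOTAL':<13} {total_change:+.2f}")
```

On these numbers the gains come from defensive interception rate, offensive fumble rate, penalty rate and defensive net YPA, while offensive net YPA, defensive forced fumble rate and offensive interception rate pull the other way.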

roy_miami · Dec 30, 2013

I would hope to get more out of a coach than just "meeting expectations" but I would say bully gate cost us one game so I would say he overachieved by at least a full game this season. I would give him one more year to further implement his system and if we don't see a major improvement kick him (and tannehill with him) to the curb.

Shouright · Dec 30, 2013

roy_miami said:
I would hope to get more out of a coach than just "meeting expectations" but I would say bully gate cost us one game so I would say he overachieved by at least a full game this season. I would give him one more year to further implement his system and if we don't see a major improvement kick him (and tannehill with him) to the curb.

The disturbing thing for me is the regression in the passing offense input into this equation, which is the most heavily weighted one, and which was driven downward in large part by the number of sacks the team took.

WVDolphan · Dec 31, 2013

Nice post.

The numbers don't lie here.

We were a much better secondary thanks mainly to Grimes.

Our offense is terrible.

Sons Of Shula · Dec 31, 2013

Did the Matrix vomit?

Awsi Dooger · Dec 31, 2013

Our YPPA Differential was weak all year. Tannehill needs to find 7.4+ yards per attempt next season.

This season we made plays in the secondary when they were available, yet it still didn't translate to a playoff berth. It's a virtual certainty we'll decline in that area next season.

Shouright · Dec 31, 2013

Awsi Dooger said:
Our YPPA Differential was weak all year. Tannehill needs to find 7.4+ yards per attempt next season.

This season we made plays in the secondary when they were available, yet it still didn't translate to a playoff berth. It's a virtual certainty we'll decline in that area next season.

I haven't studied it (yet), but I suspect the deep ball inaccuracy is largely responsible for the YPA inadequacy at this point. The decline in net YPA on the other hand was due primarily to the increase in sacks.
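Since the difference between YPA and net YPA matters here, a short Python sketch may help. It assumes the usual definitions: YPA is passing yards divided by pass attempts, while net YPA subtracts sack yardage from the yards and adds sacks to the attempts. The stat line in the example is hypothetical, not the Dolphins' actual numbers.

```python
def yards_per_attempt(pass_yards, attempts):
    """Plain YPA: sacks are ignored entirely."""
    return pass_yards / attempts

def net_yards_per_attempt(pass_yards, attempts, sacks, sack_yards):
    """Net YPA: sack yardage comes off the yards, sacks count as attempts."""
    return (pass_yards - sack_yards) / (attempts + sacks)

# Hypothetical season line, for illustration only.
pass_yards, attempts, sacks, sack_yards = 4000, 550, 50, 350

ypa = yards_per_attempt(pass_yards, attempts)
net_ypa = net_yards_per_attempt(pass_yards, attempts, sacks, sack_yards)
print(f"YPA = {ypa:.2f}, net YPA = {net_ypa:.2f}")
```

With the passing numbers held fixed, more sacks lower net YPA but leave plain YPA untouched, which is why the two stats can tell different stories about the same offense.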

FinfanInBuffalo · Jan 5, 2014

shouright said:
The disturbing thing for me is the regression in the passing offense input into this equation, which is the most heavily weighted one, and which was driven downward in large part by the number of sacks the team took.

But I thought you had objective evidence that sacks have little to no impact..... :ponder:

shouright said:
I haven't studied it (yet), but I suspect the deep ball inaccuracy is largely responsible for the YPA inadequacy at this point. The decline in net YPA on the other hand was due primarily to the increase in sacks.

WTF? Do you have split personality disorder?

Shouright · Jan 5, 2014

FinfanInBuffalo said:
But I thought you had objective evidence that sacks have little to no impact.....

Do you know the difference between YPA and net YPA?

royalshank · Jan 6, 2014

Great work and great post. Curious if the outcome variable was binary (win/lose) and if so, did you use OLS or logistic regression? The latter might provide more accurate coefficients though unlikely to change the directional nature of the model. Also, are you compiling data for the time period post 2006? Curious as the game continues to evolve / change though in the case of our fins it appears it held up well for the past 2 seasons. Really nice stuff!

JBinSD · Jan 6, 2014

Do you get extra credit in college or something for all this work?

Quite the statistician indeed. Sheesh.

GreenMts · Jan 6, 2014

Nice work!

Maybe you should send a resume into Jeffy or better yet Ross. :lol:

You never know ...

dlockz · Jan 6, 2014

shouright said:
The disturbing thing for me is the regression in the passing offense input into this equation, which is the most heavily weighted one, and which was driven downward in large part by the number of sacks the team took.

what about the major regression of the run defense

Shouright · Jan 6, 2014

dlockz said:
what about the major regression of the run defense

In terms of the variables that are most strongly correlated with winning, that doesn't show up in the data.

JBinSD said:
Do you get extra credit in college or something for all this work?

Quite the statistician indeed. Sheesh.

No, but I get to have a little more insight about the team than I would otherwise, so I'm not blindsided later by expecting something from it that I probably shouldn't have.

Shouright · Jan 6, 2014

royalshank said:
Great work and great post. Curious if the outcome variable was binary (win/lose) and if so, did you use OLS or logistic regression? The latter might provide more accurate coefficients though unlikely to change the directional nature of the model. Also, are you compiling data for the time period post 2006? Curious as the game continues to evolve / change though in the case of our fins it appears it held up well for the past 2 seasons. Really nice stuff!

No, the dependent variable is win percentage, which is continuous.
