2012 vs. 2013 Miami Dolphins: A Statistical Comparison | FinHeaven - Miami Dolphins Forums

Shouright

The following analysis is based on this page, outlined as follows:

CORRELATION SUMMARY

So far we’ve analyzed each phase of the game and its statistical connection with regular season team wins. Below is a table that lists the relevant statistics and their correlations. The table is sorted in order of absolute strength of correlation.

Stat                Win Correlation
Off Pass Yds/Att               0.61
Def Pass Yds/Att              -0.47
Off Fumble Rate               -0.46
Off Int Rate                  -0.45
Def FFumble Rate              -0.41
Def Int Rate                   0.39
Off Pen Rate                  -0.37
Off Run Yds/Att                0.18
Def Run Yds/Att               -0.04


The table is presented graphically below. Negative coefficients, such as defense pass efficiency, are shown as positive values to make it easier to compare each variable's relative importance.

table1PNG-1.png
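
For anyone who wants to rebuild the chart above, here is a minimal Python sketch of the idea: take the absolute value of each correlation in the table and rank the stats from strongest to weakest. The dictionary and function names are my own labels, not anything from the original analysis.

```python
# Win correlations copied from the table above.
win_correlation = {
    "Off Pass Yds/Att": 0.61,
    "Def Pass Yds/Att": -0.47,
    "Off Fumble Rate": -0.46,
    "Off Int Rate": -0.45,
    "Def FFumble Rate": -0.41,
    "Def Int Rate": 0.39,
    "Off Pen Rate": -0.37,
    "Off Run Yds/Att": 0.18,
    "Def Run Yds/Att": -0.04,
}


def rank_by_strength(correlations):
    """Return (stat, |r|) pairs sorted from strongest to weakest."""
    return sorted(
        ((stat, abs(r)) for stat, r in correlations.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


ranked = rank_by_strength(win_correlation)
for stat, strength in ranked:
    print(f"{stat:<18}{strength:.2f}")
```

The sign is dropped only for comparing sizes; whether a stat helps or hurts still has to be read from the original table.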


The relative importance of each aspect of the game begins to come into focus. Passing is most important, followed by turnovers, then penalties and running. For every aspect, the correlation on the offensive side of the ball is stronger than on the defensive side.

But this isn’t the final word. Correlation coefficients by themselves do not take into account the other factors. In other words, they ignore the effect of the other stats when calculating the correlation.

REGRESSION

To take all facets of the game into account simultaneously and produce a valid model of winning NFL games, we can use linear regression to estimate coefficients for each stat. The relative value of the coefficients will reveal the relative importance of each phase of the game, holding all other variables constant. This will yield estimates that are more pure and accurate than simple correlations.

The dependent variable of the regression model is regular season wins. The independent variables are the efficiency stats I’ve previously outlined. The data set continues to be all 32 teams over the past 5 seasons for a total of 160 observations. The results of the regression are detailed below.


VARIABLE      COEFFICIENT
const                5.31
O Pass Eff           1.43
D Pass Eff          -1.65
O Int Rate         -53.50
D Int Rate          81.70
O Fum Rate         -49.10
D FF Rate           70.90
O Run Eff            1.00
D Run Eff           -0.55
Pen Rate            -2.73
R-squared           0.802


Each of the independent variables is statistically significant at the 0.05 level or better, except defensive run efficiency, which is significant at 0.06. The R-squared value indicates an extremely good overall fit for the model: 80% of the variance in team wins can be explained by the included variables. The remaining 20% could be due to any number of factors, but we have to accept that outcomes in any sport are partly due to luck.
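
To make the fitting step less of a black box, here is a rough pure-Python sketch of ordinary least squares with an R-squared calculation. It is not the code behind the numbers above. The toy data is made up (it is not the 160 team seasons), the function names are hypothetical, and in practice a statistics package would do this work.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares fit of y on the columns of rows plus an intercept.

    Returns [const, coef_1, ..., coef_k] via the normal equations.
    """
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * target for d, target in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def r_squared(rows, y, coefs):
    """Share of the variance in y explained by the fitted line."""
    fitted = [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((actual - fit) ** 2 for actual, fit in zip(y, fitted))
    ss_tot = sum((actual - mean_y) ** 2 for actual in y)
    return 1.0 - ss_res / ss_tot


# MADE-UP toy data: "wins" are built exactly as 8 + 1.5*x1 - 1.0*x2,
# so the fit should hand those three numbers back.
toy_stats = [(-1.0, 0.5), (0.5, -1.0), (1.5, 0.3), (-0.4, -0.8), (0.2, 1.2), (-0.8, -0.2)]
toy_wins = [8 + 1.5 * x1 - 1.0 * x2 for x1, x2 in toy_stats]

coefs = fit_ols(toy_stats, toy_wins)
print([round(c, 3) for c in coefs])
print(round(r_squared(toy_stats, toy_wins, coefs), 3))
```

Because the toy wins contain no noise, the fit is perfect here. Real team data leaves a residual, which is where the unexplained 20% comes from.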

Using the regression results we can estimate a team’s expected wins using a linear equation. Here is what the equation would look like:

Wins = 5.31 + (1.43 * O Pass Eff) + (- 1.65 * D Pass Eff) + …

The regression coefficients are stated in terms of wins per unit of the variable. For example, the coefficient for offensive pass efficiency (yds/att) is 1.43. So for every 1 yard improvement in pass efficiency a team can expect 1.43 additional wins. When coefficients are stated this way, it makes it very easy to estimate the effect on the dependent variable (wins) given a change in one of the independent variables. But it makes it very difficult to get a sense of the relative importance of each variable. Defensive forced fumble rates are certainly not 70 times more important than offensive run efficiency.
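
As a quick sketch of how the raw coefficients get used, the snippet below multiplies a coefficient by a change in its stat to get the expected change in wins, holding everything else constant. Only the per-yard efficiency stats are used in the examples, since the units of the rate stats are not spelled out above. The names are my own.

```python
# Raw (unstandardized) coefficients copied from the regression table above.
RAW_COEFFICIENT = {
    "O Pass Eff": 1.43,
    "D Pass Eff": -1.65,
    "O Run Eff": 1.00,
    "D Run Eff": -0.55,
    "D FF Rate": 70.90,
}


def wins_change(stat, delta):
    """Expected change in wins for a change of `delta` units in one stat,
    holding every other variable constant."""
    return RAW_COEFFICIENT[stat] * delta


# One extra yard per pass attempt on offense.
print(round(wins_change("O Pass Eff", 1.0), 2))
# Half a yard more allowed per pass attempt on defense.
print(round(wins_change("D Pass Eff", 0.5), 3))
# The misleading raw ratio behind the "70 times" remark.
print(round(RAW_COEFFICIENT["D FF Rate"] / RAW_COEFFICIENT["O Run Eff"], 1))
```

The last line shows why raw coefficients cannot be compared directly: the ratio reflects the units of each stat, not its importance.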

To reveal the true relative importance of each factor, we need to standardize each variable by calculating the number of standard deviations from its average value. In statistics, these are known as “normalized” or "standardized" variables, noted by the prefix “z.”
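
Standardizing is a one-line calculation. Here is a minimal sketch, using the 2012 Dolphins offensive net YPA and the league average and standard deviation from the 2012 table further down in this post. The function name is my own.

```python
def z_score(value, league_mean, league_sd):
    """Number of standard deviations `value` sits above (+) or below (-) average."""
    return (value - league_mean) / league_sd


# 2012 Dolphins offensive net yards per attempt vs. the league.
z_off_pass = z_score(5.8817, 6.27, 0.72)
print(round(z_off_pass, 2))  # -0.54, the DOLPHINS Z value in the 2012 table
```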

Here are the regression results again, this time calculated with standardized coefficients. The significance of each variable, and the overall fit of the model remain the same since only the units of the variables have changed.

VARIABLE        COEFFICIENT
constant               8.06
Z O Pass Eff           1.14
Z D Pass Eff          -0.92
Z O Int Rate          -0.45
Z D Int Rate           0.76
Z O Fum Rate          -0.33
Z D FF Rate            0.42
Z O Run Eff            0.46
Z D Run Eff           -0.24
Z Pen Rate            -0.39


It may seem like we’ve gone through a tortured process to arrive at these coefficients. But they are merely the mathematical weight we would need to give each factor to have the best estimate of actual team wins. These are based on real-world data from every team’s season between 2002 and 2006.

Here is a graph representing each variable’s relative weight. Negative coefficients are shown as positive values.


table2PNG-1.png


Probably the simplest way to interpret the chart is this way. If my team is average in absolutely everything, I'd expect to win 8 games. But if my team is average in everything except offensive pass efficiency, in which we're one standard deviation above average, I'd expect to win 9.14 games (8 + 1*1.14).

So if my team was the league's best at running the ball, say 2.5 standard deviations above average, but average at everything else, we'd expect to win 9.15 games (8 + 2.5*0.46). Compare that to passing--if my team were average at everything but best in the league in passing, we'd expect to win 10.85 games (8+2.5*1.14).
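
The three what-if numbers above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the baseline uses the rounded 8 wins from the examples rather than the 8.06 constant, and the names are mine.

```python
BASELINE_WINS = 8  # the examples above round the 8.06 constant down to 8

# Standardized coefficients copied from the table above.
STD_COEFFICIENT = {
    "Z O Pass Eff": 1.14,
    "Z D Pass Eff": -0.92,
    "Z O Int Rate": -0.45,
    "Z D Int Rate": 0.76,
    "Z O Fum Rate": -0.33,
    "Z D FF Rate": 0.42,
    "Z O Run Eff": 0.46,
    "Z D Run Eff": -0.24,
    "Z Pen Rate": -0.39,
}


def expected_wins(z_by_stat):
    """Expected wins for a team that is average (z = 0) in every stat
    except the ones listed in `z_by_stat`."""
    return BASELINE_WINS + sum(STD_COEFFICIENT[s] * z for s, z in z_by_stat.items())


print(round(expected_wins({"Z O Pass Eff": 1.0}), 2))  # one SD above average passing
print(round(expected_wins({"Z O Run Eff": 2.5}), 2))   # league-best running
print(round(expected_wins({"Z O Pass Eff": 2.5}), 2))  # league-best passing
```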

Using the model above, we get the following data for the 2012 Miami Dolphins:

                2012 SEASON  LEAGUE AVG  LEAGUE SD  DOLPHINS Z  COEFFICIENT  WEIGHTED # WINS
OFF NET YPA          5.8817        6.27       0.72       -0.54         1.14            -0.61
DEF NET YPA            6.19        6.27       0.65       -0.12        -0.92             0.11
OFF INT RATE         2.5794         2.7       0.94       -0.13        -0.45             0.06
DEF INT RATE         1.6667         2.7       0.81       -1.28         0.76            -0.97
OFF FUM RATE         3.2468    2.506729   0.629049        1.18        -0.33            -0.39
DEF FF RATE           0.014      0.0147   0.004144       -0.17         0.42            -0.07
OFF YPC              4.0955        4.15       0.46       -0.12         0.46            -0.05
DEF YPC              4.0232        4.16        0.4       -0.34        -0.24             0.08
PENALTY RATE         0.3598       0.416     0.0759       -0.74        -0.39             0.29

ESTIMATED WINS  6.50
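
For anyone who wants to check the 2012 estimate, here is a short sketch of the whole calculation: standardize each stat against the league, weight it by its standardized coefficient, and add the results to the 8.06 constant. All figures are copied from the table above. The layout and names are my own.

```python
CONSTANT = 8.06  # constant from the standardized regression

# (stat, 2012 Dolphins value, league average, league SD, standardized coefficient)
FACTORS_2012 = [
    ("OFF NET YPA", 5.8817, 6.27, 0.72, 1.14),
    ("DEF NET YPA", 6.19, 6.27, 0.65, -0.92),
    ("OFF INT RATE", 2.5794, 2.7, 0.94, -0.45),
    ("DEF INT RATE", 1.6667, 2.7, 0.81, 0.76),
    ("OFF FUM RATE", 3.2468, 2.506729, 0.629049, -0.33),
    ("DEF FF RATE", 0.014, 0.0147, 0.004144, 0.42),
    ("OFF YPC", 4.0955, 4.15, 0.46, 0.46),
    ("DEF YPC", 4.0232, 4.16, 0.4, -0.24),
    ("PENALTY RATE", 0.3598, 0.416, 0.0759, -0.39),
]


def estimated_wins(factors, constant=CONSTANT):
    """Constant plus the sum of (z-score * standardized coefficient)."""
    total = constant
    for _stat, value, avg, sd, coef in factors:
        z = (value - avg) / sd
        total += coef * z
    return total


estimate_2012 = estimated_wins(FACTORS_2012)
print(round(estimate_2012, 2))
```

The z-scores are kept unrounded here, so individual weighted terms can differ from the table in the last digit, but the total still lands on about 6.50.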

We get the following data for the 2013 Dolphins:

                2013 SEASON  LEAGUE AVG  LEAGUE SD  DOLPHINS Z  COEFFICIENT  WEIGHTED # WINS
OFF NET YPA          5.4601        6.27       0.72       -1.12         1.14            -1.28
DEF NET YPA          6.0419        6.27       0.65       -0.35        -0.92             0.32
OFF INT RATE         3.1987         2.7       0.94        0.53        -0.45            -0.24
DEF INT RATE         3.1034         2.7       0.81        0.50         0.76             0.38
OFF FUM RATE         2.0942    2.506729   0.629049       -0.66        -0.33             0.22
DEF FF RATE          0.0081      0.0147   0.004144       -1.59         0.42            -0.67
OFF YPC              4.1261        4.15       0.46       -0.05         0.46            -0.02
DEF YPC              4.1281        4.16        0.4       -0.08        -0.24             0.02
PENALTY RATE         0.2838       0.416     0.0759       -1.74        -0.39             0.68

ESTIMATED WINS  7.46

So we see an improvement in estimated wins of almost exactly one game (6.50 to 7.46), which matches precisely the difference in the team's record between this year and last. We could also say both teams "overachieved" a bit: each won about half a game more than expected (7 and 8 wins) rather than half a game less (6 and 7 wins). This is likely attributable to random and/or intangible factors that may or may not be measurable quantitatively.

What's also clear is where the one-game improvement in 2013 came from: better pass defense (lower defensive net YPA and a higher defensive interception rate), greater discipline (lower penalty yards per play), and better ball protection on offense (lower fumble rate).
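
To see which factors moved the estimate, here is a small sketch that subtracts each 2012 "WEIGHTED # WINS" value from its 2013 counterpart and ranks the changes. The inputs are the rounded values from the two tables above, so the results are approximate. The names are my own.

```python
STATS = [
    "OFF NET YPA", "DEF NET YPA", "OFF INT RATE", "DEF INT RATE", "OFF FUM RATE",
    "DEF FF RATE", "OFF YPC", "DEF YPC", "PENALTY RATE",
]

# "WEIGHTED # WINS" rows copied from the 2012 and 2013 tables above.
weighted_2012 = [-0.61, 0.11, 0.06, -0.97, -0.39, -0.07, -0.05, 0.08, 0.29]
weighted_2013 = [-1.28, 0.32, -0.24, 0.38, 0.22, -0.67, -0.02, 0.02, 0.68]

# Change in each factor's contribution to estimated wins, 2012 -> 2013.
deltas = {
    stat: round(now - before, 2)
    for stat, before, now in zip(STATS, weighted_2012, weighted_2013)
}

for stat, change in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{stat:<14}{change:+.2f}")

total_change = round(sum(deltas.values()), 2)
print(f"{'TOTAL':<14}{total_change:+.2f}")
```

The biggest gain shows up in defensive interception rate, followed by offensive fumble rate and penalty rate, while offensive net YPA and defensive forced fumble rate pull in the other direction.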
 
I would hope to get more out of a coach than just "meeting expectations," but I would say Bullygate cost us one game, so he overachieved by at least a full game this season. I would give him one more year to further implement his system, and if we don't see a major improvement, kick him (and Tannehill with him) to the curb.
 
The disturbing thing for me is the regression in the passing offense input into this equation, which is the most heavily weighted one, and which was driven downward in large part by the number of sacks the team took.
 
Nice post.

The numbers don't lie here.

We were a much better secondary thanks mainly to Grimes.

Our offense is terrible.
 
Our YPPA Differential was weak all year. Tannehill needs to find 7.4+ yards per attempt next season.

This season we made plays in the secondary when they were available, yet it still didn't translate to a playoff berth. It's a virtual certainty we'll decline in that area next season.
 
I haven't studied it (yet), but I suspect the deep ball inaccuracy is largely responsible for the YPA inadequacy at this point. The decline in net YPA on the other hand was due primarily to the increase in sacks.
 
The disturbing thing for me is the regression in the passing offense input into this equation, which is the most heavily weighted one, and which was driven downward in large part by the number of sacks the team took.

But I thought you had objective evidence that sacks have little to no impact..... :ponder:

---------- Post added at 09:12 PM ---------- Previous post was at 09:10 PM ----------

I haven't studied it (yet), but I suspect the deep ball inaccuracy is largely responsible for the YPA inadequacy at this point. The decline in net YPA on the other hand was due primarily to the increase in sacks.

WTF? Do you have split personality disorder?
 
Great work and great post. Curious if the outcome variable was binary (win/lose) and if so, did you use OLS or logistic regression? The latter might provide more accurate coefficients though unlikely to change the directional nature of the model. Also, are you compiling data for the time period post 2006? Curious as the game continues to evolve / change though in the case of our fins it appears it held up well for the past 2 seasons. Really nice stuff!
 
Do you get extra credit in college or something for all this work?

Quite the statistician indeed. Sheesh.
 
Nice work!

Maybe you should send a resume into Jeffy or better yet Ross.:lol: You never know ...
 
The disturbing thing for me is the regression in the passing offense input into this equation, which is the most heavily weighted one, and which was driven downward in large part by the number of sacks the team took.

What about the major regression of the run defense?
 
In terms of the variables that are most strongly correlated with winning, that doesn't show up in the data.

---------- Post added at 01:51 AM ---------- Previous post was at 01:49 AM ----------

Do you get extra credit in college or something for all this work?

Quite the statistician indeed. Sheesh.
No, but I get to have a little more insight about the team than I would otherwise, so I'm not blindsided later by expecting something from it that I probably shouldn't have. ;)
 
Great work and great post. Curious if the outcome variable was binary (win/lose) and if so, did you use OLS or logistic regression? The latter might provide more accurate coefficients though unlikely to change the directional nature of the model. Also, are you compiling data for the time period post 2006? Curious as the game continues to evolve / change though in the case of our fins it appears it held up well for the past 2 seasons. Really nice stuff!
No, the dependent variable is win percentage, which is continuous.
 