In our book The Wages of Wins we discuss a basic point about statistical analysis: for samples, size matters. Specifically, the larger the sample (the more relevant data), the more accurate the statistical results (the smaller the standard error). Thus looking at a team over a few games, or a league over a couple of years, and making definitive statements about the team or the league is a highly dubious endeavor.
How do we come up with such a conclusion? Here is a guide for further reading:
Why are small sample sizes a problem, and why do we thus need to use a longer time period?
A step-by-step guide on how to perform the payroll and performance analysis.
Why do I use relative payroll (3rd paragraph)?
Why do I use R-squared (or adjusted R-squared) as my measuring stick?
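To make the sample-size point concrete, here is a minimal sketch of how the standard error of a team's winning percentage shrinks as the number of games grows. It assumes, purely for illustration, that each game is an independent coin flip with a fixed win probability p; real NBA games are not that simple. The function name win_pct_standard_error is hypothetical.

```python
from math import sqrt


def win_pct_standard_error(p, games):
    """Standard error of an observed winning percentage.

    Assumption for illustration: every game is an independent trial
    with the same win probability p (a binomial model).
    """
    return sqrt(p * (1 - p) / games)


# A true .500 team observed over samples of increasing size.
for games in (10, 41, 82, 820):
    se = win_pct_standard_error(0.5, games)
    print(f"{games:4d} games: standard error = {se:.3f}")
```

Under this simplified model, going from a handful of games to a full season (or to several seasons) cuts the standard error substantially, which is why short windows support only weak statements.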
Recently I read the following: NBA teams in the top 10 of payroll for the past two years have all made the playoffs, and thus the author concludes that the following teams will make the playoffs this year: Nets, Knicks, Heat, Bulls, Lakers, Raptors, Clippers, Celtics, Thunder and Pacers. Others looking at on-court performance have left out some of the teams in the payroll top 10 in their predictions.
So payroll "predicts" ten of the sixteen NBA playoff teams. I think this would be much more convincing if the top 16 teams in terms of payroll had made the playoffs for, say, ten NBA seasons. That would give this type of statement much more credibility. Since the statement covers only ten of the sixteen playoff spots, the prediction covers only 62.5% of the playoff field (10 of 16), leaving 37.5% of the playoff teams unexplained. For statisticians this is a large error.
So I thought I would look at the relationship between NBA (relative) payroll and NBA regular season performance. To do that I need NBA payroll data, which is provided at University of Michigan Professor Rodney Fort's website (direct NBA payroll link). Since I was already there, I also used his data for NBA regular season winning percentage (direct NBA winning percentage link) to perform this statistical analysis.
I ran a linear regression (controlling for heteroskedasticity) starting with the 1990-91 NBA regular season and finishing with the most recently completed NBA regular season (2012-13). On the plus side, relative payroll is positive and statistically significant. On the down side, relative payroll and winning percent have almost 11% of their variation in common, meaning that relative payroll fails to explain about 90% of the variation in regular season winning percentage. Frankly, that is rather poor if you are using payroll to "predict" performance. If we take a look at the last two NBA seasons (one of which was a lockout season), we see that the relationship between relative payroll and regular season winning percent improves to about 23% using adjusted R-squared. Even so, for the last two seasons the variation in relative payroll doesn't explain even a quarter of the variation in NBA team winning percent.
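For readers who want to try this kind of regression themselves, here is a rough sketch, not the exact procedure behind the numbers above. It assumes relative payroll means a team's payroll divided by the league-average payroll for that season, and it uses White (HC0) standard errors as one common way to control for heteroskedasticity. The function names and the six-team data set are invented for illustration and are not taken from Professor Fort's data.

```python
from math import sqrt
from statistics import mean


def relative_payroll(payrolls):
    """Team payroll divided by the league average for one season (assumed definition)."""
    league_avg = mean(payrolls)
    return [p / league_avg for p in payrolls]


def ols_with_robust_se(x, y):
    """Simple regression of y on x with an intercept.

    Returns the slope, its White (HC0) heteroskedasticity-robust
    standard error, R-squared and adjusted R-squared.
    Needs more than two observations and some variation in x and y.
    """
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(e ** 2 for e in resid)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    # One regressor plus an intercept leaves n - 2 degrees of freedom.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    # HC0: weight each squared residual by its squared deviation in x.
    robust_var = sum((xi - x_bar) ** 2 * e ** 2 for xi, e in zip(x, resid)) / sxx ** 2
    return {"slope": slope, "robust_se": sqrt(robust_var), "r2": r2, "adj_r2": adj_r2}


# Invented numbers for six hypothetical teams in one season (payroll in $ millions).
payrolls = [85.0, 72.0, 66.0, 60.0, 58.0, 49.0]
win_pct = [0.600, 0.450, 0.700, 0.500, 0.350, 0.400]

fit = ols_with_robust_se(relative_payroll(payrolls), win_pct)
print(f"slope = {fit['slope']:.3f}, robust SE = {fit['robust_se']:.3f}")
print(f"R-squared = {fit['r2']:.3f}, adjusted R-squared = {fit['adj_r2']:.3f}")
```

With several seasons of data, relative payroll would be computed season by season before pooling the team-seasons into one regression. R-squared is then the share of the variation in winning percent that the fitted line accounts for.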
If we use a bigger recent sample (five years, covering the 2008-09 to 2012-13 NBA regular seasons), we see that the amount of variation in common between relative payroll and regular season winning percent is still about 23%. For the five years before that (the 2003-04 to 2007-08 seasons), things are vastly different. Relative payroll is statistically insignificant; in other words, statistically it has ZERO impact on winning percent. Even if you overlook that, the amount of variation in common between the two is 1%. Hence the relationship between relative payroll and winning percent has improved in the past five years compared to the previous five-year period. Still, to me that is not a very convincing relationship.
As always, the proof is in the pudding, so I will come back to this at the end of the 2013-14 NBA regular season to take a look at how well payroll relates to getting into the NBA playoffs.