Thursday, February 16, 2012

MLB Payroll and Performance

Last season after the World Series I wrote about the payroll and performance hypothesis in MLB and questioned how strong this relationship was during the 2011 MLB season. At that time I admitted that only looking at one season was not a long enough time period to make a definitive conclusion about payroll and performance. So I thought that I would revisit this over a longer time period in MLB.

So I am going to look at the entire period for which USA Today has MLB team payroll data, which is 1988 to 2011. Then, using team standings data for the same seasons, I am going to look at the impact that relative payroll has on team regular season performance.

Why do I use relative payroll? It is a statistical reason, but let me see if I can explain. Of the two variables, winning percentage is stationary (the average winning percentage in every season is 0.500), but total payroll is non-stationary (the average of total payroll increases over time). Running a regression of a stationary dependent variable (winning percentage) on a non-stationary independent variable (total payroll) will give a poorer fit. Thus, to get both variables stationary, I convert the non-stationary total payroll variable into a stationary one (relative payroll). Relative payroll is calculated as team i's total payroll divided by the average team payroll in season j. For example, for the Philadelphia Phillies in 2011, I took Philadelphia's total payroll of $172,976,379 and divided it by the average total payroll for the 2011 MLB season, $92,872,043, which gives a relative payroll for Philadelphia of 1.8625.
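To make the relative payroll calculation concrete, here is a minimal Python sketch. The function name `relative_payrolls` and the three-team toy league are my own illustrations, not part of the original data set; only the two Phillies figures come from the numbers quoted above.

```python
from statistics import mean


def relative_payrolls(payrolls):
    """Divide each team's payroll by the league-average payroll for one season.

    `payrolls` maps team name -> total payroll for a single season.
    (Hypothetical helper, named here only for illustration.)
    """
    league_avg = mean(payrolls.values())
    return {team: p / league_avg for team, p in payrolls.items()}


# Figures quoted in the post for the 2011 season.
phillies_2011 = 172_976_379
league_avg_2011 = 92_872_043
phillies_relative = phillies_2011 / league_avg_2011
print(round(phillies_relative, 4))  # 1.8625

# Made-up three-team league, just to show the per-season conversion.
toy = relative_payrolls({"A": 150.0, "B": 100.0, "C": 50.0})
print(toy)  # {'A': 1.5, 'B': 1.0, 'C': 0.5}
```

Because every season is divided by its own average, relative payroll averages 1.0 in each season, which is what makes it comparable across years in the same way winning percentage is.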

So after running the numbers, I find that relative payroll in Major League Baseball "explains" in a statistical sense only 17.6% of the variation in team winning percentage from 1988 to 2011. Another way of looking at this is that relative payroll fails to explain more than 80% of why MLB teams win at different rates. If I fail to take this non-stationarity issue into account and just use total payroll instead of relative payroll, then the explanatory power drops to 6.8%.
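For readers wondering what "explains" means here, the sketch below computes R-squared from a one-variable least-squares regression in plain Python. It is only an illustration under my own assumptions: the function name `r_squared` and the toy numbers are hypothetical, and this is not necessarily the exact procedure used to get the 17.6% figure.

```python
def r_squared(x, y):
    """R-squared from a simple OLS regression of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)  # total sum of squares
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares: variation left over after the fitted line.
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return 1 - ss_res / syy


# Made-up relative payrolls and winning percentages for four teams.
toy_payroll = [0.6, 0.9, 1.1, 1.4]
toy_win_pct = [0.480, 0.470, 0.540, 0.510]
print(r_squared(toy_payroll, toy_win_pct))

# The share left unexplained by the figure reported in the post.
unexplained = round(1 - 0.176, 3)
print(unexplained)  # 0.824
```

With real data, `x` would hold each team-season's relative payroll and `y` the matching winning percentage, pooled over 1988 to 2011.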

For this reason, I seriously doubt that payroll is a good indicator of regular season team performance in Major League Baseball.

Still not convinced? Fine - tomorrow I will post a step-by-step procedure showing how I calculated this result.


  1. This finding does not surprise me. MLB teams could be viewed as operating much like a factory. If a factory spends the most on assets, that does not guarantee it will manage those assets in a manner that maximizes profits (wins). Also, it may be difficult to manage the assets properly, given that their value may depreciate very rapidly due to unpreventable breakdowns (injuries). This risk of breakdown means every team should expect some percentage of its payroll to be a sunk cost.

    I would be interested in seeing how payroll predicts the success of a team in the playoffs. Limiting the data to the playoffs would limit the comparison to the teams that have best managed their assets (and the risk of injury would be much lower, since the playoffs are significantly shorter than the regular season).

    Another thing to consider is that some divisions tend to have higher payrolls (AL East), but those teams play each other more. If the high spending teams are playing a disproportionate number of games against themselves, and the same is true for the lower spending teams, we would expect to see a lower correlation between payroll and win percentage. You could develop a production model (like your college football model: runs scored, errors, etc.) and compare the payroll to the productivity of each team. Another thing to try is checking the correlation within each division. You could use records from only the division matchups and the average payroll of each division.

  2. Nathan,

    You have a lot of really good ideas, and I hope to come back and look at some of these later this year. Right now, my attention is turning to last season's batter evaluation now that the data is posted. I hope to have this done in the next couple of days.