Illustrating the Projection Problem: Real Life Example

In my last post I wrote about how accurate projections have to regress player performance to the mean. I used at-bats (AB) as the example, since AB is simply a measure of playing time and role, and is not subject to the kind of variance that Hits is.

Here is a chart showing the Top 10 2009 AB Leaders, and how they did in 2010.


Player        2009 AB   2010 AB   Reason
A Hill            682       528   Injury, suckiness
J Rollins         672       350   Injury
O Cabrera         656       494   Injury
N Markakis        642       629
I Suzuki          639       680   Healthy
R Cano            637       626
R Braun           635       619
M Tejada          635       636
D Jeter           634       663
B Roberts         632       230   Injury
AVERAGE           646       546   Decrease: 15%

The 15 percent shortfall in the following year is greater than usual; the dropoff from year to year is usually about 10 percent. Part of the difference is due to the severity of Brian Roberts’ injury. Although he played on Opening Day, he was banged up enough to warrant reducing the AB in his projection by a significant amount.
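For anyone who wants to check the arithmetic, here is a small sketch that recomputes the group averages from the chart. It also reruns the numbers with Roberts left out, to see how much of the gap he accounts for. The function name pct_decrease is just a label chosen for this illustration. Working from unrounded averages, the decrease comes out a little higher than the rounded figure in the chart.

```python
# (player, 2009 AB, 2010 AB), copied from the chart above
ab = [
    ("A Hill", 682, 528),
    ("J Rollins", 672, 350),
    ("O Cabrera", 656, 494),
    ("N Markakis", 642, 629),
    ("I Suzuki", 639, 680),
    ("R Cano", 637, 626),
    ("R Braun", 635, 619),
    ("M Tejada", 635, 636),
    ("D Jeter", 634, 663),
    ("B Roberts", 632, 230),
]


def pct_decrease(rows):
    """Fractional drop in total AB from 2009 to 2010 for the given rows."""
    total_2009 = sum(r[1] for r in rows)
    total_2010 = sum(r[2] for r in rows)
    return (total_2009 - total_2010) / total_2009


everyone = pct_decrease(ab)
without_roberts = pct_decrease([r for r in ab if r[0] != "B Roberts"])

print(f"All ten players:   {everyone:.1%}")
print(f"Excluding Roberts: {without_roberts:.1%}")
```

Run on the chart's numbers, this gives about 15.6 percent for all ten players and about 10.4 percent with Roberts excluded, which is close to the usual 10 percent dropoff.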

But what to do about the lost at-bats from Hill, Rollins and Cabrera? If we have no idea which players are going to play less, should we subtract the missing AB from the group as a whole, meaning that nobody’s projection looks exactly right, because in fact it isn’t? Or should we ignore the fact that these AB will go missing and project everyone as if they will play full time, ensuring that the projections will be more wrong overall?
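To put rough numbers on that tradeoff, here is a sketch that scores both approaches against the chart's 2010 results. Several things in it are assumptions made only for illustration: the "full time" projection is simply each player's 2009 AB, the group discount is the usual 10 percent dropoff, and a projection "looks right" if it lands within 20 AB of the actual total. None of this is a real projection system.

```python
# (player, 2009 AB, 2010 AB), copied from the chart above
ab = [
    ("A Hill", 682, 528),
    ("J Rollins", 672, 350),
    ("O Cabrera", 656, 494),
    ("N Markakis", 642, 629),
    ("I Suzuki", 639, 680),
    ("R Cano", 637, 626),
    ("R Braun", 635, 619),
    ("M Tejada", 635, 636),
    ("D Jeter", 634, 663),
    ("B Roberts", 632, 230),
]

HAIRCUT = 0.10      # assumption: the usual year-to-year dropoff
CLOSE_ENOUGH = 20   # assumption: "looks right" means within 20 AB


def evaluate(projections, actuals):
    """Return (total AB over-projected, RMSE, count of near-exact projections)."""
    errors = [p - a for p, a in zip(projections, actuals)]
    total_miss = sum(errors)
    rmse = (sum(e * e for e in errors) / len(errors)) ** 0.5
    close = sum(1 for e in errors if abs(e) <= CLOSE_ENOUGH)
    return total_miss, rmse, close


actual = [row[2] for row in ab]
full_time = [row[1] for row in ab]                   # ignore the missing AB
discounted = [row[1] * (1 - HAIRCUT) for row in ab]  # spread the loss over the group

for label, proj in (("Full time", full_time), ("Discounted", discounted)):
    total_miss, rmse, close = evaluate(proj, actual)
    print(f"{label:<11} total miss: {total_miss:7.1f}  RMSE: {rmse:6.1f}  "
          f"within {CLOSE_ENOUGH} AB: {close}")
```

On this one small sample, the full-time projections overshoot the group by 1,009 AB, while the discounted ones overshoot by about 363 and have the lower root-mean-square error. But four of the full-time projections land within 20 AB of the real number, and none of the discounted ones do. That is the dilemma in miniature: the discounted set is less wrong as a group, yet no single line in it looks right.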