I used to evaluate my projections each year against what really happened. When I started doing this, 22 years ago, I was ambitious and driven to get better results, but after many years the limits of our predictive ability are obvious. The bottom line is that there is no way to predictively model the game’s stochastic nature. Random stuff causes a significant percentage of the events that happen on the baseball field, and trying to guess whether those random events are going to go one way or the other is absurd.
Or rather, trying to guess them accurately is absurd. We continue to make our guesses, and we (and I mean all baseball predictors) continue to get something like 75 percent accuracy (which is a 25 percent failure rate).
That significant variance also makes it hard to judge whether improvements are actually improvements or not. If I score around 75 percent each year, does it mean something systematic when that number dips to 73 percent, or bumps up to 77 percent, the next year? Or is that a reflection of randomness? At the very least it’s hard to tell.
But I didn’t stop measuring because it’s hard to tell. I stopped because the timing was bad. The season ends and I’m all in on the Guide. There isn’t much time to do the comparison, and not much urgency given how little there is to be learned.
When the Guide is done, it’s the holidays, and then a rush to complete the much more extensive projections in the Patton $ Software by the end of January. And once that’s done, there’s getting ready for the season. But this year, someone asked specifically for an evaluation, so I put together a spreadsheet. You can see it here.
The first thing I check is the overall accuracy of the Top 100 hitters and pitchers. That is, I look at the Top 100 hitters projected for the most at bats, and compare their category totals with what actually happened. This year:
AB: 94 percent
R: 99 percent
2B: 93 percent
3B: 90 percent
HR: 114 percent
RBI: 100 percent
BB: 101 percent
K: 96 percent
SB: 86 percent
CS: 85 percent
BA: 101 percent
Not perfect, and not necessarily imperfect in obvious ways.
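For anyone who wants to run the same kind of check, here is a minimal sketch of the arithmetic. It assumes each percentage is the group's actual category total divided by its projected total (the direction of the ratio is my assumption for this example), and the numbers are made up for illustration, not taken from the spreadsheet.

```python
# Hypothetical category totals for a group of hitters (made-up numbers,
# not the real 2016 projections or results).
projected = {"AB": 1200, "HR": 40, "SB": 20}
actual = {"AB": 1140, "HR": 44, "SB": 15}


def percent_of_projection(projected, actual):
    """Return each actual category total as a whole-number percent of the projection.

    Assumption: the ratio is actual / projected. Flip it if you prefer
    to express the projection as a percent of the actual result.
    """
    return {cat: round(100 * actual[cat] / projected[cat]) for cat in projected}


percents = percent_of_projection(projected, actual)
for cat, pct in percents.items():
    print(f"{cat}: {pct} percent")
```

A number near 100 only says the group total came out right. Individual misses in opposite directions can cancel each other out, which is why a total-based check is a first look and not the whole story.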
One of the standard ways to measure the accuracy of projections is to use the Correlation Coefficient, which measures the extent to which two variables (in this case the 2016 projection and the 2016 actual result) have a linear relationship. To be a little more brass tacks about it, a correlation of 1 means that two sets of data rise and fall in perfect lockstep: plot one against the other and you get a straight, upward-sloping line, even if the values themselves show up in different parts of the graph. 3, 4, 5 would have a correlation of 1 with 5, 6, 7.
A correlation of 0 means that the two data sets have no linear relationship to each other.
Most interestingly, a -1 correlation would mean that the second data set moves in exactly the opposite direction from the first: when one goes up, the other goes down by a proportional amount. Negatively correlated.
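To make those three cases concrete, here is a small sketch of the standard Pearson correlation calculation in plain Python. The function name `pearson` and the sample lists are mine, chosen for illustration. This is not necessarily how the spreadsheet computes it.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list never changes")
    return sxy / math.sqrt(sxx * syy)


same_direction = pearson([3, 4, 5], [5, 6, 7])      # moves in lockstep
opposite_direction = pearson([3, 4, 5], [7, 6, 5])  # mirror image
no_linear_link = pearson([1, 2, 3], [2, 1, 2])      # no straight-line trend

print(same_direction, opposite_direction, no_linear_link)
```

The first pair comes out at essentially 1, the second at essentially -1, and the third at essentially 0, matching the three cases described above.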
With that in mind, here are the correlations for my 2016 projections compared to what actually happened.
The first thing to note, for my self-esteem, is that when we look at the Top 500 projected hitters, I hit the .75 mark in AB, R, HR, RBI and, almost, SB. That’s the holy grail, I think. You want your set to reach .75 in correlation. That’s a pretty good correlation, if you know what I mean.
But, and big but, the numbers are much more problematic when measuring the Top 100 projected hitters. AB is a mess, but oddly HR, RBI and SB aren’t that bad. Remember that .75 is about as good as it gets, though that statement comes with provisos.
What I’m getting at here is that there are many ways to evaluate projections.
If you look at the whole data set, as we do here in the Top 500 projections, we get about the results we hope for. This is the limit of a baseball projection, or close to it.
Another way to evaluate projections is to sort by the number of at bats players actually had. This gives you a list of the most active players on the year, and shows how well we predicted their production.
A little better, it turns out, which means that we’re doing better predicting how the guys who actually played produced than we are predicting what the guys projected for the most at bats are going to produce. By a little.
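The difference between the two approaches is only in which players make the list. Here is a rough sketch of the idea, using a handful of invented players and numbers (none of this is real data): pick the top group by projected at bats, pick it again by actual at bats, and compute the home run correlation for each group.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented rows: (player, projected AB, actual AB, projected HR, actual HR)
players = [
    ("A", 600, 150, 25, 5),
    ("B", 580, 590, 20, 24),
    ("C", 550, 560, 18, 15),
    ("D", 300, 610, 10, 28),
    ("E", 250, 240, 8, 7),
    ("F", 200, 520, 5, 16),
]


def top_n(rows, n, key_index):
    """Return the n rows with the largest value in column key_index."""
    return sorted(rows, key=lambda row: row[key_index], reverse=True)[:n]


def hr_correlation(rows):
    """Correlation between projected HR (column 3) and actual HR (column 4)."""
    return pearson([row[3] for row in rows], [row[4] for row in rows])


by_projected = top_n(players, 4, key_index=1)  # most projected at bats
by_actual = top_n(players, 4, key_index=2)     # most actual at bats

print([row[0] for row in by_projected], hr_correlation(by_projected))
print([row[0] for row in by_actual], hr_correlation(by_actual))
```

The two lists overlap but are not the same, so the same projections can earn different correlation scores depending on which group you grade.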
Pitchers are going to have to wait for later, but I hope this gives a little bit of a taste about what projection reviewing means. Maybe we’ll take a look at some other systems, too, coming up.