Marcel vs. the others

Having just finished and released my projections for the Patton $ Online Software product, I’m thinking about the accuracy and usefulness of projections more than usual (and I usually think about this subject a lot).

Those of us who make projections want our projections to be the most accurate, but it turns out that measuring a set of projections versus what actually happened is a complicated business. Just how complicated becomes clear if you read the first two parts of Tom Tango’s analysis of five different projection systems from 2007-2010.

But you don’t have to. Tom says you can skip those parts and still appreciate the results, which show that CHONE was probably the best projection system in recent years, but that it wasn’t much better than Marcel. Tango invented Marcel as a simple baseline projection against which more sophisticated systems can be measured. If they don’t do better than the baseline, they aren’t adding value.
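To make the baseline idea concrete, here is a minimal sketch of how one might score a system against a baseline. It assumes root-mean-square error as the yardstick, which is my choice for illustration and not necessarily the measure Tango uses. The function names and all of the numbers are made up.

```python
from math import sqrt


def rmse(projected, actual):
    """Root-mean-square error between two equal-length lists of numbers."""
    if len(projected) != len(actual):
        raise ValueError("projected and actual must have the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(projected, actual)]
    return sqrt(sum(squared_errors) / len(actual))


def value_added(system, baseline, actual):
    """Baseline error minus system error.

    A positive result means the system missed by less than the baseline did.
    Zero or negative means the extra sophistication bought nothing.
    """
    return rmse(baseline, actual) - rmse(system, actual)


# Invented numbers for illustration only: a rate stat for four hitters.
actual = [0.300, 0.350, 0.320, 0.280]
baseline = [0.320, 0.330, 0.330, 0.300]  # stand-in for a simple baseline
fancy = [0.310, 0.340, 0.330, 0.290]     # stand-in for a more complex system

print(round(value_added(fancy, baseline, actual), 4))
```

A real comparison would need many more players and some care about who is included, which is part of why the measuring is so complicated.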

The question is how much value any of the systems is adding. The answer depends on what you’re looking for, but the assertion by one of the commenters that accurate projections probably matter most to fantasy players rubs me the wrong way. As the survey results show, using projections to value players for your fantasy league isn’t going to get you very far. The margin of error for each projection is far wider than the range of projections from all the various sources.

Different projection systems incorporate different aspects of baseball analysis. My projections use complex regression analysis of previous performance, filtered first by age, and then by my tweaking.

Other systems use other inputs. PECOTA draws on the career arcs of similar players to project into the future, for instance, while ZiPS and CHONE incorporate some of the newer stats to establish complex systems of regressing outlying performance to the mean.

I have my doubts about how far such empirical formulation will take us toward the grail of accurate projections. The ball hasn’t moved much in recent years despite lots of new data, but all the work is necessary to tease out what real information there is to be found in the numbers. Tango’s report and the many comments that follow it are invaluable for showing what the challenges are, and perhaps eventually suggesting a way forward.