What is a projection?

The Book Blog

Tom Tango looks at some assumptions we make about forecasting player performance, and looks at the race for 2009 HR champ to illustrate how the high variability of performance means that our forecasts–even if we show just one number–really outline a range of scenarios. In the optimistic one, perhaps Albert Pujols plays the entire season, faces weaker than usual pitchers, and matches the scenario at the top of the forecaster’s scale.

In the most pessimistic scenario, Pujols throws his back out the day after the forecast is made, and doesn’t play all season long. So, if he might hit 50 at the top end and 0 at the bottom end, what is a good projection for Pujols?
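One way to make the "range of scenarios" idea concrete is to treat the single projected number as a probability-weighted average of the scenarios. The sketch below is only an illustration: the home run totals and their probabilities are invented for the example and come from neither Tom's post nor any projection system.

```python
# Hypothetical scenarios for one hitter's season: (home runs, probability).
# Every number here is made up for illustration.
scenarios = [
    (0, 0.05),   # hurt before the season starts
    (20, 0.15),  # misses a large part of the year
    (35, 0.50),  # a typical healthy season
    (42, 0.25),  # a strong season
    (50, 0.05),  # everything breaks right
]


def expected_value(outcomes):
    """Return the probability-weighted average of (value, probability) pairs."""
    total_probability = sum(p for _, p in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(value * p for value, p in outcomes)


projection = expected_value(scenarios)
print(f"single-number projection: {projection:.1f} HR")
```

Under these made-up weights the single number lands a little below the "typical healthy season" line, because the injury scenarios pull the average down even though no individual scenario matches it.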

I’m not a fan of assigning percentiles of probability, as PECOTA does, because they don’t mean anything real. From the comments on Tom’s post I learned that PECOTA apparently applies the same distribution rules to all players, which may match what we know now but certainly doesn’t give us any more information.

To make my projections I’ve run regressions that give me a baseline formula for using the information available in a player’s past performance, modified mostly by age. The problem with this approach is that the regression absorbs the volatility of the sample and spreads it throughout. So, if I apply the formula to the Top 100 projected hitters, I get a projected number of at bats (and other stats) about 10 percent less than the Top 100 hitters produced the preceding year. That loss can be attributed mostly to injuries, since these are generally reliable players.

The problem is that applying this loss across the board makes all the projections look weak. No player gets 600 AB, nobody hits 40 homers, things just look wrong. This is exactly what Tom Tango’s “dumb” projection system, Marcel the Monkey, does. Marcel only looks at previous stats and age and applies its regression formula. This gives an excellent projection of where production will come from and how much production will come, but while it tests well it doesn’t look right.

Projections are pretty limited in their applicability to allied uses, like team forecasts, but they are a good way to present the information about what is expected of a player. Does he run? Does he hit for power? The projection aggregates the information we have about a player and comes up with a compromise view that helps us smooth over the ups and downs of individual statlines. But to make it look right, you have to add that 10 percent of at bats and stats back into the individual lines, even though this means projecting too many stats overall.

Nowhere did this become evident more quickly than in the chart Tom ran in his post showing Marcel’s top 13 HR projections for 2009. In the first column are the Marcel forecast HR for each player. In the second column is the player’s name. In the third column are my 2009 projected homers. In the fourth column are my projected homer totals reduced by 10 percent. In the fifth column are the actual total homers.

Marcel Proj  Hitter            PK Proj  PK Adj  Actual
40           Howard, Ryan      46       41      45
32           Rodriguez, Alex   24       22      30
32           Fielder, Prince   33       30      46
32           Dunn, Adam        37       33      38
32           Braun, Ryan       40       36      32
31           Pujols, Albert    38       34      47
31           Pena, Carlos      33       30      39
30           Thome, Jim        34       31      23
29           Dye, Jermaine     34       31      27
28           Delgado, Carlos   27       24      4
28           Cabrera, Miguel   38       34      34
28           Berkman, Lance    36       32      25
28           Beltran, Carlos   31       28      10
401          TOTAL             451      406     400
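The arithmetic behind the table can be checked with a short script. This is a minimal sketch using only the numbers in the table; the rounding rule (multiply by 0.9, round to the nearest homer) is my assumption, though it does reproduce the PK Adj column.

```python
# Rows from the table above: (hitter, Marcel HR, PK projected HR, actual HR).
rows = [
    ("Howard, Ryan", 40, 46, 45),
    ("Rodriguez, Alex", 32, 24, 30),
    ("Fielder, Prince", 32, 33, 46),
    ("Dunn, Adam", 32, 37, 38),
    ("Braun, Ryan", 32, 40, 32),
    ("Pujols, Albert", 31, 38, 47),
    ("Pena, Carlos", 31, 33, 39),
    ("Thome, Jim", 30, 34, 23),
    ("Dye, Jermaine", 29, 34, 27),
    ("Delgado, Carlos", 28, 27, 4),
    ("Cabrera, Miguel", 28, 38, 34),
    ("Berkman, Lance", 28, 36, 25),
    ("Beltran, Carlos", 28, 31, 10),
]

ADJUSTMENT = 0.90  # the across-the-board 10 percent reduction (assumed rounding)

marcel = [r[1] for r in rows]
pk_proj = [r[2] for r in rows]
actual = [r[3] for r in rows]
pk_adj = [round(hr * ADJUSTMENT) for hr in pk_proj]


def total_abs_error(projected, observed):
    """Sum of the player-by-player misses, ignoring direction."""
    return sum(abs(p - o) for p, o in zip(projected, observed))


for label, column in [("Marcel", marcel), ("PK Proj", pk_proj), ("PK Adj", pk_adj)]:
    print(
        f"{label:8s} total={sum(column):3d} "
        f"group miss={sum(column) - sum(actual):+d} "
        f"sum of player misses={total_abs_error(column, actual)}"
    )
```

The two reduced columns miss the group total by only a handful of homers, while the player-by-player misses in every column add up to more than 100. That is the gap between getting the group right and getting the individuals right.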

As Tom concludes, we can get the right number for the group. The real question is what do we want the projection to do? The post is well worth reading, as are the comments, if you’re interested in this murky side of the sabermetrics game.

Get Off My Lawn – Minor League Ball

by John Sickels

John writes one of those tough screeds that sound, about halfway through, like the complaining crap of an old man. But John isn’t nearly as old as he thinks he is, and what he’s writing about is something I hope all of us who care about baseball and stats and the data have already thought about.

The point is that thanks to PITCHf/x and the efforts of BIS and MLB and everyone else scoring baseball games, we’re getting a ton more information about every pitch in every major league game. And the automation of this process promises even more in the coming years.

Much of this data, thanks to MLB by the way, is available to everyone, and so it has become a happy sandbox for baseball fans with a fondness for math.

John’s gripe, if you can call it that, is that all these analysts are sorting through the data and ending up with micro conclusions that don’t really mean much to someone watching any particular game.

What I would add is that we know an awful lot about baseball because of the things we’ve learned before this great outpouring of pitch by pitch data. Much of what we learn after all the new data has been processed and tested and used is going to support the observations of those who watched the game closely before all the data was known.

When I’m grumpy I wonder why I’m reading yet another study that confirms what we already knew about this or that baseball situation. But that doesn’t mean those studies aren’t important. We gain the most knowledge by testing everything, each situation and contingency and viewpoint, and then seeing what shakes out. Confirmation means as much as a fresh idea.

Despite all the noise out there, that’s what’s happening now. John recognizes that, but he’s honest enough to point out that it makes him weary. Me, too.

Convince your league to replace BA with OBP

Rotographs

In standard 4×4 and 5×5 leagues, OBP is so clearly superior to BA as a rate stat, and we all know it, that I’m shocked not everyone has made the change. Once you’ve tried it you’ll never go back, because players’ values actually reflect their values (minus defense) in the major leagues.

But it’s hard to get people to change, which is why only one of my leagues uses OBP instead of BA. We’ve talked about making the change in Tout Wars, but since part of the league’s goal is to offer draft guidance, it isn’t going to happen until you all switch over. Get going!

Forecaster and Handbook are out!

I got my copy of the Baseball Forecaster about 10 days ago, but closing the magazine meant not cracking it, even though I’ve got a short bit in it (which happened to run here first, about WHIP v. WH/9), until now.

Ron’s lead essay is very smart. It’s about how wrong we are about players, year after year, and he wonders why we pursue exacting but nearly always wrong projections. Then he comes up with something new, called the Mayberry Method.

There’s a lot to like about the way the MM summarizes a player’s skills in a descriptive way. Yet despite its simplicity, I’m not convinced it is going to catch on. New stuff often doesn’t, even when it has real merit. On the other hand, the benchmarks MM describes so succinctly are becoming increasingly entrenched as leading indicators, making me wonder why, if we’re getting better at defining leading indicators, we’re not getting better at predicting breakouts.

As Ron says in the piece, we may be smarter now than we were 20 years ago, but that may not be such a good thing.

Steve Moyer always gives us so-called experts a copy of the hot-off-the-press Bill James Handbook at First Pitch Arizona, for which I am very grateful. Not that I wouldn’t buy it, I have many times, but this way it ends up in my hands even sooner.

The book continues to grow, with increased focus on the defense awards and rankings, focus on baserunning skills, and the ever useful park factors. I’m a great fan of baseball-reference.com and fangraphs.com, both of which I use all day long, but I sit and read the Bill James Handbook, poring over its pages as if it were a ripping good yarn, which in many ways it is.

I’m glad for both these books and recommend them highly.

The Forecasters Challenge 2009–revisited again

The Forecasters Challenge 2009

Tom Tango said today he’ll be running the Forecasters Challenge again in 2010. The primary judging will come from the Pros-Joes format, which is described in the link above. The idea, basically, is to have each pro draft against 21 inferior lists. In last year’s challenge my projections ranked 3rd using this method.

For the record, using the 22 pros against all the other pros, my projections ranked 5th.

In the head to head scoring system, I was second division.

Overall, Rotoworld and John Eric Hanson seemed to score the best.

salary vs performance

ben fry

There has been a lot of talk about the Yanks buying the pennant, which ignores the fact that for eight years they bought the pennant but lost. I have a hard time working up to umbrage, but I do think it’s hard to judge the Yanks a great team because of all the extra money they spent.

Or rather, they may be a great team, but that’s because of all the money that was spent. The good news is that Cashman finally got it kind of right.

Ben Fry has charted the 30 teams’ payrolls against their standings throughout the season. I’m not sure you learn anything concrete from this, but it’s a beautiful chart nonetheless. Have fun with the slider up top.

The Forecasters Challenge 2009–final results

TangoTiger.net

I’ve written about Tom Tango’s Forecasters Challenge here before. Tom asked many of us to contribute our preseason rankings of baseball players based on a metric he devised to calculate a player’s contributions on the field. His plan was to run thousands of drafts from these lists. The team that performed best would be judged to be the best, most useful projection system.

There was lots to like in this approach, though as Tom details in the report linked to here, there were also some surprises. He writes about some of the key structural ones, which have led him to run other iterations of the drafts, trying to find a format that gives a more nuanced judgment of the relative lists.

There are three other points that I think should be made.

First, there is a good chance that the weighting between hitting and pitching is off. This is certainly true of my team (which finished fifth of 22 in the original contest). Whether this is because I weighted hitting and pitching the same, which I did, or because I didn’t discount pitchers for their unreliability, which I didn’t, or because I just undervalued hitters, something was off. Looking at the two components individually, which Tom has said he will do, should help us better understand how the original contest worked.

Secondly, not everyone used straight projections. Some systems weighted for position scarcity. This wasn’t prohibited, so I’m not complaining, but when it comes time to analyze the results it should be understood that in at least a few cases sardines are being compared to mackerels. A simple correlation of all the projection systems to the final actual ranking would be of interest.
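A "simple correlation" of ranked lists is usually done with a rank correlation. Here is a minimal sketch of Spearman's version in plain Python. The system names and all the values are hypothetical placeholders, not data from the Forecasters Challenge, and Spearman is my choice of measure rather than anything Tom has said he will use.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical end-of-season values for six players, and two made-up systems.
actual_value = [31.0, 27.5, 22.0, 18.5, 12.0, 9.0]
systems = {
    "System A": [29.0, 30.0, 20.0, 21.0, 10.0, 11.0],
    "System B": [15.0, 25.0, 28.0, 12.0, 20.0, 14.0],
}

for name, projected in systems.items():
    print(name, round(spearman(projected, actual_value), 3))
```

Because only the ordering matters, a list built on straight projections and one adjusted for position scarcity can at least be scored on the same footing, though the scarcity adjustment will still show up as disagreement with the actual ranking.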

Thirdly, as Tom notes about how Marcel handles players with no ML playing time, all systems use a sort of generic noise projection for the marginal players. This means in a correlation study that the noise can overwhelm the estimates of what players expected to have regular playing time will do. For this reason, I don’t think it would be a bad idea for Tom to run the drafts using a 12 or 15 team league format, so that not every projection system is in every league. This would mitigate the problem of small ranking differences being exaggerated by the draft procedure, and may give us a better result. His head-to-head matchups are interesting, too, especially since so many ranks changed dramatically, but another angle of analysis on the data would certainly help us figure out what is better.

These notes are not meant to be critical in any way. Tom’s enterprise has thrown off a whole bunch of interesting data, which I hope he will keep returning to all winter long. Once the magazine is done I expect to dig in, too. He deserves a mountain of credit for conceiving this project and seeing it through. Ideally, we’ll be able to do it again next year with a better idea of what we’re going for. Thanks Tom!

Save $25 on First Pitch Arizona!

For what I think is the seventh time I’m heading out this November for Ron Shandler’s First Pitch Arizona symposium. This year’s dates are November 6-8, though I’m flying the fourth so I can get in a game on Thursday afternoon.

You cannot imagine how great it is to watch some of the best young talent around (this year we have Stephen Strasburg) in a near-empty park, allowing you to sit just about anywhere you want (including behind home plate, where you can sometimes spy the radar readings of the ML scouts who are always in attendance).


I Love New Metrics!

Except when I don’t.

This story is about O-Swing %, which measures how often a batter swings at pitches outside the strike zone, as a percentage of all the out-of-zone pitches he sees. The writer says that O-Swing % is really interesting, and then goes on to prove (unless his numbers are wrong) that it is pretty much meaningless.

What is actually interesting is that the writer does a decent job of demonstrating why the apparently broad swing in O-Swing % numbers is meaningless. It boils down to the fact that some batters swing more, and so they hit the ball more, while some batters swing less, and hit the ball less. Consider O-Swing % exhausted, at least for now.
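For reference, here is a minimal sketch of how the stat is computed, assuming the usual definition (swings at pitches outside the zone divided by pitches outside the zone). The Pitch record and both hitters' pitch logs are hypothetical, built only to show how O-Swing % can simply track a batter's overall swing rate.

```python
from dataclasses import dataclass


@dataclass
class Pitch:
    in_zone: bool  # did the pitch cross the strike zone?
    swung: bool    # did the batter swing?


def swing_pct(pitches):
    """Percentage of all pitches the batter swung at."""
    return 100.0 * sum(p.swung for p in pitches) / len(pitches)


def o_swing_pct(pitches):
    """Percentage of out-of-zone pitches the batter swung at."""
    outside = [p for p in pitches if not p.in_zone]
    if not outside:
        return None  # undefined if the batter saw nothing outside the zone
    return 100.0 * sum(p.swung for p in outside) / len(outside)


def make_log(zone_swings, zone_takes, out_swings, out_takes):
    """Build a hypothetical pitch log from four counts."""
    return (
        [Pitch(True, True)] * zone_swings
        + [Pitch(True, False)] * zone_takes
        + [Pitch(False, True)] * out_swings
        + [Pitch(False, False)] * out_takes
    )


# Two made-up hitters who each see five pitches in and five out of the zone.
free_swinger = make_log(zone_swings=4, zone_takes=1, out_swings=2, out_takes=3)
patient_hitter = make_log(zone_swings=3, zone_takes=2, out_swings=1, out_takes=4)

for name, log in [("free swinger", free_swinger), ("patient hitter", patient_hitter)]:
    print(f"{name}: Swing% = {swing_pct(log):.0f}, O-Swing% = {o_swing_pct(log):.0f}")
```

In this toy case the hitter with the higher O-Swing % is just the hitter who swings more at everything, which is the writer's point.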

When there is reliable pitch location information there will doubtless be information derived from these numbers that will be of interest, but it certainly won’t be simple or absolute. The game isn’t simply a matter of cause and effect, but a complex system of adjustments and readjustments that change how everything happens. It seems to me the miracle is that the game is played on the same sized field now as it was 100+ years ago. In that context, the variation in results should lead us to explore what changes have been made.

But that has nothing to do with O-Swing %.

Origins of Major League Starting Pitchers, 2008

Minor League Ball

John Sickels looks at all the starting pitchers with 10 or more win shares in 2008 and at where they were when they stepped over to the professional game. First rounders have a big edge, but what stands out is that successful pitchers come from everywhere.

A similar list tracking the last 20 or more years would be of great interest, if anyone has time tomorrow (or the next day), since the list itself isn’t exactly objective. I would assume that the way scouts and organizations work has changed over the years, and this would be reflected. Or, more tantalizingly, maybe not.