Pitching by the Numbers: Digging into BABIP

Michael Salfino
Yahoo! Sports

I'm very conflicted about Batting Average on Balls in Play (BABIP). It definitely helps quantify randomness in baseball. But it does so with such apparent precision that it tempts us to completely discount a pitcher's skill at inducing weakly hit balls. It also discounts projectable defensive skill, as if batted balls found their way into gloves at random.
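
For readers who want the arithmetic behind the stat, here is a minimal sketch of the conventional BABIP calculation: hits minus home runs, divided by at-bats minus strikeouts minus home runs plus sacrifice flies. The function name and the sample stat line are made up for illustration and do not come from any pitcher discussed here.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical stat line allowed by a pitcher, for illustration only.
sample = babip(hits=180, home_runs=20, at_bats=700, strikeouts=150, sac_flies=5)
print(round(sample, 3))
```

Home runs and strikeouts drop out because neither gives a fielder a chance to make a play, which is why the stat is read as a mix of contact quality, defense and luck.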

We've dealt with this before here, in March, when we noted the BABIP bias against fly-ball pitchers. And inspired by Daniel Kahneman's new book, "Thinking, Fast and Slow," I was tempted to use BABIP as one of the statistical measures in a formula for predicting pitching excellence. In other words, I was going to assume that a good BABIP is really a measurement of skill rather than luck.

But first I needed to compare 2012 BABIP allowed by pitchers to their 2011 numbers. Those who significantly beat the league average both years are less likely to be lucky in this stat and more likely to be skilled – or at least to have skilled defenders (think Tampa Bay). Those who are significantly worse than league average both years are more likely to be more hittable than we want and/or to have poor defenses (either way, the near-term result is the same, barring real-life trades). But if it's mostly random, we should have a lot of guys who seem great at limiting BABIP one year and terrible at it the other.

For context, league BABIP this year is .290 (and going up of late); last year it was .295. So the good BABIP pitchers needed to be at .275 or less, the bad ones at .308 or more.
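
A minimal sketch of that two-season screen, assuming only the cutoffs quoted above (.275 or less is good, .308 or more is bad). The function and label names are made up for illustration; the sample BABIP figures are copied from the tables in this column.

```python
GOOD_MAX = 0.275  # ".275 or less" counts as a good BABIP allowed
BAD_MIN = 0.308   # ".308 or more" counts as a bad BABIP allowed


def season_label(babip):
    """Label one season's BABIP allowed as 'good', 'bad', or 'middle'."""
    if babip <= GOOD_MAX:
        return "good"
    if babip >= BAD_MIN:
        return "bad"
    return "middle"


def two_year_bucket(babip_2011, babip_2012):
    """Place a pitcher in one of the three lists, or 'unlisted' otherwise."""
    labels = {season_label(babip_2011), season_label(babip_2012)}
    if labels == {"good"}:
        return "consistently good"
    if labels == {"bad"}:
        return "consistently bad"
    if labels == {"good", "bad"}:
        return "flipped"
    return "unlisted"


# (2011 BABIP, 2012 BABIP), taken from the tables in this column.
pitchers = {
    "Jered Weaver": (0.252, 0.236),
    "Chris Volstad": (0.317, 0.319),
    "Ian Kennedy": (0.274, 0.364),
}
for name, (b11, b12) in pitchers.items():
    print(name, "->", two_year_bucket(b11, b12))
```

Any pitcher who lands between the two cutoffs in either season falls outside all three lists, which is why the sketch returns "unlisted" for him.
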
Let's start with the pitchers who qualified for an ERA title both years and were at .275 or less in each:

Player             Year  BABIP  ERA
Ervin Santana      2012  .274   6.16
Ervin Santana      2011  .274   3.38
Jered Weaver       2012  .236   1.61
Jered Weaver       2011  .252   2.41
Jeremy Hellickson  2012  .229   2.51
Jeremy Hellickson  2011  .224   2.95
Joe Saunders       2012  .248   1.24
Joe Saunders       2011  .275   3.69
Josh Beckett       2012  .242   4.45
Josh Beckett       2011  .249   2.89
Justin Verlander   2012  .233   2.38
Justin Verlander   2011  .237   2.40
Kyle Lohse         2012  .213   1.62
Kyle Lohse         2011  .272   3.39
Matt Cain          2012  .158   2.35
Matt Cain          2011  .265   2.88
Ricky Romero       2012  .214   3.64
Ricky Romero       2011  .245   2.92
Ted Lilly          2012  .171   1.38
Ted Lilly          2011  .266   3.97
Clayton Kershaw    2012  .232   2.63
Clayton Kershaw    2011  .274   2.28

I wish there were more of these guys so that I could proceed with my plan to go through the looking glass and use BABIP as a skill measure. And note I'm not saying that good BABIP guys are good, period – just more likely to be good because of their ability to induce batters to hit into outs when they make contact. Notice the fly-ball pitchers on the list. I'm glad to see Hellickson, who I predicted would maintain something much closer to his 2011 BABIP rate than most experts forecast. Also notice that the underlying basis for that prediction – Hellickson being an extreme fly-ball pitcher – applies to these pitchers generally.

Obviously, Cain's rate is crazy low. Same for Lilly. But expecting them, or any of the guys above, to correct to near .300 over the rest of 2012 seems crazy too, to me at least.

Now the guys who are consistently bad at BABIP:

Player             Year  BABIP  ERA
Chris Volstad      2012  .319   6.11
Chris Volstad      2011  .317   4.89
Jaime Garcia       2012  .365   2.78
Jaime Garcia       2011  .324   3.56
Max Scherzer       2012  .453   7.77
Max Scherzer       2011  .316   4.43
Rick Porcello      2012  .308   5.64
Rick Porcello      2011  .318   4.75
Zack Greinke       2012  .373   3.94
Zack Greinke       2011  .323   3.83

Much smaller list. This lends credence to my belief, shared by many in the statistical community, that hitters generally control outcomes (which, if true, makes most pitching projections VERY dicey). For the most part, the bad BABIP guys are running into more than their fair share of good hitting. However, the key takeaway here is that there are exceptions, meaning pitchers who control outcomes to a much greater degree than average. Those are the guys in the first chart. Again, I'm not saying the guys in this second, bad BABIP chart are bad (though you couldn't pay me to own Scherzer, Porcello and Volstad). The point is that these guys can be safely placed in the hittable category (once contact is made).

Also, if we accept that hitters controlling outcomes is the rule and not the exception, we should have a relatively big list of pitchers who happen to run into good hitting one year and bad hitting in the other. I think if BABIP were completely random, this list, while already bigger than the other two, would be much bigger:

Player             Year  BABIP  ERA
Anibal Sanchez     2012  .266   2.73
Anibal Sanchez     2011  .317   3.67
Carl Pavano        2012  .255   4.91
Carl Pavano        2011  .308   4.30
Chad Billingsley   2012  .207   2.64
Chad Billingsley   2011  .313   4.21
Chris Capuano      2012  .263   2.73
Chris Capuano      2011  .317   4.55
Ian Kennedy        2012  .364   3.38
Ian Kennedy        2011  .274   2.88
Cole Hamels        2012  .325   2.78
Cole Hamels        2011  .259   2.79
Jake Westbrook     2012  .247   1.30
Jake Westbrook     2011  .318   4.66
Jhoulys Chacin     2012  .329   7.30
Jhoulys Chacin     2011  .264   3.62
Josh Tomlin        2012  .318   5.27
Josh Tomlin        2011  .254   4.25
Madison Bumgarner  2012  .245   2.53
Madison Bumgarner  2011  .329   3.21
Matt Garza         2012  .224   2.67
Matt Garza         2011  .312   3.32
Ricky Nolasco      2012  .262   2.76
Ricky Nolasco      2011  .337   4.67
Ubaldo Jimenez     2012  .266   5.02
Ubaldo Jimenez     2011  .315   4.68

What's the fantasy takeaway with this third chart? Sell the guys who are doing really well in 2012 and buy the guys who are doing poorly (depending, of course, on the depth of your league; when I say "buy" and "sell" I really mean "likely to be significantly better" and "likely to be significantly worse").

So, buy: Kennedy. That's it. No one should own Josh Tomlin even though he'll get better. Ditto for Chacin, now in the minors.

And sell: Sanchez, Billingsley, Westbrook, Bumgarner, Garza, Nolasco. And "sell" doesn't mean "cut" or "give away." It means "sell high."

I'm neutral on Hamels because his ERA, so key, is the same both years, meaning he's likely getting unlucky in how hits are sequenced. So that's evidence that a widely varying BABIP doesn't necessarily mean a widely varying ERA.
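
The buy/sell reasoning above can be roughed out in a few lines. This is only a sketch under stated assumptions: the function name is made up, and the 0.05 ERA tolerance used to call two seasons "the same" is my own placeholder, not a number from the analysis. The sample figures come from the third chart.

```python
def fantasy_lean(babip_2011, babip_2012, era_2011, era_2012,
                 good_max=0.275, bad_min=0.308, era_tolerance=0.05):
    """Rough lean for a pitcher whose BABIP flipped between seasons.

    era_tolerance is an assumed cutoff for treating two ERAs as equal.
    """
    if abs(era_2012 - era_2011) <= era_tolerance:
        return "neutral"   # ERA did not move with BABIP (the Hamels case)
    if babip_2012 <= good_max and babip_2011 >= bad_min:
        return "sell"      # good 2012 BABIP after a bad 2011
    if babip_2012 >= bad_min and babip_2011 <= good_max:
        return "buy"       # bad 2012 BABIP after a good 2011
    return "no call"


print(fantasy_lean(0.274, 0.364, 2.88, 3.38))  # Ian Kennedy
print(fantasy_lean(0.312, 0.224, 3.32, 2.67))  # Matt Garza
print(fantasy_lean(0.259, 0.325, 2.79, 2.78))  # Cole Hamels
```

The rule knows nothing about whether a pitcher is worth a roster spot, so it would flag Tomlin and Chacin as buys even though neither belongs on a fantasy team right now.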