Pitching by the Numbers: Studying most misunderstood projection stat

Let’s start the 2016 season of Pitching by the Numbers by examining batting average on balls in play and the false assumptions that make it the most misunderstood and dangerous statistical tool in our projection toolbox.

BABIP was designed explicitly to isolate luck. Last year, the league’s BABIP was .299. If a pitcher allowed a higher BABIP, he was unlucky. Anything lower, he was lucky. We should expect him to allow exactly a .299 BABIP because that’s what hitters do against pitchers. Or so the theory goes.

But the first problem is batted ball types: ground balls and fly balls result in hits at much different rates (about .240 for ground balls and .205 for fly balls). The rest are line drives (.700-ish). But batted ball speed on line drives varies widely, since trajectory is the defining characteristic of a line drive and we’ve all seen many soft ones. What if we could isolate the balls in play that were well hit, regardless of trajectory? After all, well-hit grounders, fly balls and liners are more likely to be hits than weakly hit ones. Could the contact quality a pitcher allows explain a low/“lucky” BABIP?

Enter Inside Edge, a stat provider to Major League Baseball. I prefer their approach of having two scouts view each play and requiring them to agree on whether a ball was well hit. We all have watched many baseball games, and deciding if a ball is well struck is pretty cut and dried. I’d rather bet on eyeballs here than an algorithm. Though I note for the record that there can be a wide variance between the Inside Edge data and the data on other sites.

According to Inside Edge, the average hitter hit .299 on balls in play but .167 of balls in play were well hit. It stands to reason that pitchers who allowed an unusually low average on balls in play were lucky ONLY to the degree their contact-strength of balls in play was closer to average. But if a pitcher had an unusually low rate of well-hit balls in play, then having a low BABIP makes sense. It’s not obviously a product of sheer luck in that case. 

So we charted the Major League leaders in BABIP last year to test the theory. Maybe some of them would be average or worse in the well-hit rate of their balls in play and we could more confidently predict regression. But instead what we find is that all of the top BABIP pitchers were good to great at limiting well-hit balls in play. Instead of the average well-hit rate of .167, they ranged from .100 (Sonny Gray) to .147 (Hector Santiago). To make the math really simple, the average pitcher had an actual BABIP .132 above his well-hit rate (.299 minus .167). So let’s just add .132 to the well-hit rate of the “luckiest” pitchers and ballpark what their expected BABIP should have been in 2015.
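Here is a minimal sketch of that back-of-the-envelope math, using only the league figures and the two well-hit rates quoted above. The function name `expected_babip` is mine, for illustration; this is the simple add-the-gap estimate described in this column, not a formal model.

```python
# League-wide 2015 figures quoted in this column (Inside Edge data).
LEAGUE_BABIP = 0.299
LEAGUE_WELL_HIT = 0.167

# Gap between actual BABIP and well-hit rate for the average pitcher (.132).
GAP = LEAGUE_BABIP - LEAGUE_WELL_HIT


def expected_babip(well_hit_rate):
    """Ballpark expected BABIP: the pitcher's well-hit rate plus the league gap."""
    return round(well_hit_rate + GAP, 3)


# Well-hit rates cited above for the two ends of the leaderboard range.
well_hit = {"Sonny Gray": 0.100, "Hector Santiago": 0.147}

for name, rate in well_hit.items():
    print(f"{name}: {expected_babip(rate):.3f}")
# Sonny Gray: 0.232
# Hector Santiago: 0.279
```

So even the worst contact manager among the leaders projects to a BABIP about 20 points under the league mark.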

[Chart: MLB Players Ranked by 2015 BABIP | PointAfter]

When we compare “expected BABIP” to “actual BABIP” we find that some pitchers on the low-BABIP list were actually quite unlucky. In other words, their BABIP should have been even lower. Look at Matt Harvey, for example. He was 14th at .277 BABIP, but since only .108 of his balls in play were well hit, we’d expect about a .240 average on them. Yet FIP and many touts will look at Harvey’s .277 BABIP and regress it up, which is the wrong way. The smart play with Harvey is projecting his BABIP to be even lower than last year. I’d peg it at .260. His teammate Jake deGrom (.114 well-hit rate) also should be viewed as unlucky in BABIP (it was .277, too, but should have been about .246). Of course, the Mets defense could be a factor here but not reasonably to this degree.
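The Harvey and deGrom comparison can be sketched the same way: subtract the expected BABIP from the actual one. The helper name `luck_gap` and the sign convention (positive means more hits fell in than the contact quality suggests) are my own labels for illustration.

```python
# League-wide 2015 figures quoted in this column (Inside Edge data).
LEAGUE_BABIP = 0.299
LEAGUE_WELL_HIT = 0.167


def luck_gap(actual_babip, well_hit_rate):
    """Actual BABIP minus expected BABIP; positive suggests bad luck."""
    expected = well_hit_rate + (LEAGUE_BABIP - LEAGUE_WELL_HIT)
    return round(actual_babip - expected, 3)


# (actual BABIP, well-hit rate) as cited above.
pitchers = {
    "Matt Harvey": (0.277, 0.108),
    "Jake deGrom": (0.277, 0.114),
}

for name, (actual, well_hit) in pitchers.items():
    print(f"{name}: {luck_gap(actual, well_hit):+.3f}")
# Matt Harvey: +0.037
# Jake deGrom: +0.031
```

Both come out 30-plus points on the unlucky side, which is why regressing them up toward .299 points the wrong way.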

Clearly there is luck/variance in BABIP. The correlation between the BABIP our leaders allowed and their rate of allowing well-hit balls in play is poor. Zack Greinke should have had about a .270 BABIP, not .232, so that number is unlikely to be repeated. But the direction of Greinke’s BABIP, better than average, is bettable because he was much better than average in allowing weak contact.

But many of us will look at a guy like Marco Estrada especially, give his BABIP the stink eye and just hammer him on regression. That regression sure seems less likely when you look at the batted ball quality he allowed. Estrada was 32.3% better than average in preventing well-hit balls in play. Doesn’t that earn you significantly fewer hits allowed? If we say that low hit quality on balls in play is due to bad hitting and not pitching, and thus is also luck, we’ve relegated pitchers to mere bystanders, and that’s clearly false given the repeated excellence of many.
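To put a rough number on Estrada, here is a sketch that converts his 32.3% figure into a well-hit rate and then into an expected BABIP. It assumes "32.3% better than average" means his well-hit rate was 32.3% below the league's .167; that reading is my assumption, and the names below are illustrative.

```python
# League-wide 2015 figures quoted in this column (Inside Edge data).
LEAGUE_BABIP = 0.299
LEAGUE_WELL_HIT = 0.167


def well_hit_from_pct_better(pct_better):
    """Assumed reading: 'X% better' means a well-hit rate X% below league average."""
    return LEAGUE_WELL_HIT * (1 - pct_better / 100)


estrada_well_hit = well_hit_from_pct_better(32.3)
estrada_expected = estrada_well_hit + (LEAGUE_BABIP - LEAGUE_WELL_HIT)

print(f"well-hit rate: {estrada_well_hit:.3f}")
print(f"expected BABIP: {estrada_expected:.3f}")
# well-hit rate: 0.113
# expected BABIP: 0.245
```

Under that assumption his well-hit rate lands between Harvey’s .108 and deGrom’s .114, so an expected BABIP in the mid-.240s is in the same neighborhood as theirs.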

Dallas Keuchel is the most interesting name on this list because he’s a ground-ball pitcher. So he’s not only allowing weak contact but the contact he allows is relatively harmless to begin with (very few ground balls result in extra bases). Yet the experts project his ERA to track his FIP (meaningless for him, our model says). And thus he’s going in the second tier of pitchers when he clearly seems Tier 1 to me, with an expected ERA in the mid-2.00s and not the low 3.00s we’re seeing in most projections.

So let’s reorder the BABIP leaders by their expected BABIP that’s tethered to how well their balls in play were hit. These are the guys we can bet on to beat the balls-in-play odds again: Sonny Gray, Jake Arrieta, Keuchel, Harvey, Estrada, deGrom, Garrett Richards, Michael Wacha. I’ll be surprised if any of them is over .280 this year in the statistic.

For fantasy purposes, I’d confidently project deGrom, Harvey and Keuchel as No. 1 fantasy starters who you can get in the second tier about round four or five. Richards and Wacha are the prototype middle-round starters who are bettable as a fantasy No. 2. 

Estrada’s problem is pitching in the American League East but he’s a decent reserve pick in mixed leagues, where you will get him. Gray is not a bargain, as I put him in the tier with Richards and Wacha but he’s going about three to four rounds earlier.
