Statcast wants to change how we're consuming baseball games

MLB columnist
Yahoo Sports
Major League Baseball has long bought into the power of data. (AP)

Sometime soon, there is going to be a new version of Wins Above Replacement available, and its goal, aside from encapsulating a player’s value into one tidy number, is simple: Don’t be scary. The plan does not involve dumbing down the metric that serves as the flashpoint between those who yearn for a catch-all and those who lament it. On the contrary, as with almost everything it does, Major League Baseball Advanced Media wants to make it so smart people can’t help but like it.

BAM, as the company is known, is MLB’s greatest success story of the past quarter century, an investment of a couple million dollars from each team that blossomed into the largest tech outfit on the East Coast and the hub for much of the video that streams into American households and devices. Even the lowest-revenue franchises would sell for more than a billion dollars today in part because of their 1/30th stake in BAM.

Over-the-top video is far from its only product. Two years ago, BAM introduced Statcast, a system that married high-definition cameras and Doppler radar to track every movement on the field. Its intent wasn’t merely to provide teams with a treasure chest of data. BAM understood a very important point: With the power of its data, baseball could write its own narrative. All it needed were the right evangelists.

At 12:45 ET this afternoon, three of them will stand together and preach the gospel to the like-minded during a presentation at the MIT Sloan Sports Analytics Conference, the industry’s pre-eminent gathering of data hounds. Daren Willman, 34, was so good at parsing Statcast data as a hobby that BAM hired him. He joined 35-year-old Mike Petriello, who serves as something of a public translator, explaining the whys of Statcast’s discoveries in print and on TV. And then there’s Tom Tango, which isn’t his real name. He cares not to provide his age, either. All that matters is he’s widely recognized as one of the greatest sabermetric minds of this generation, if not ever, and another get-together, the SABR Analytics Conference in Phoenix next week, will be his public unveiling after a couple decades of anonymity.

During their hourlong presentation, the three plan to run through a slew of ideas, including one that has percolated for years: a Statcast-based WAR they hope can peel back the mystery of the metric using a data tree. WAR, for all its in-practice deficiencies, isn’t a particularly difficult concept to grasp: It places a value on measurable aspects of a player’s game and compares it to a theoretical Triple-A-quality replacement scrub.

BAM wants anyone interested to understand from whence its WAR came. So the idea is for a tree, one that allows a click or tap on the overall number, which then branches into components (offensive, defensive, baserunning, etc.) that show the player’s value in each. From there comes a breakdown of more subcategories, and subs of subs, and on until piecing together WAR isn’t so daunting.
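One way to picture such a tree is as a nested structure in which every number is the sum of the numbers beneath it. The sketch below is purely illustrative and is not BAM’s design: the subcategory names and every value in it are invented for the example, and only the top-level split into offense, defense and baserunning comes from the description above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One box in the hypothetical WAR tree."""
    name: str
    value: float = 0.0  # only meaningful for leaves (no children)
    children: List["Node"] = field(default_factory=list)

    def total(self) -> float:
        # A leaf reports its own value; a branch is the sum of its children.
        if not self.children:
            return self.value
        return sum(child.total() for child in self.children)


def render(node: Node, depth: int = 0) -> List[str]:
    """Return the tree as indented lines, one per node."""
    lines = [f"{'  ' * depth}{node.name}: {node.total():+.1f}"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Invented numbers for an imaginary player, for illustration only.
war = Node("WAR", children=[
    Node("offense", children=[
        Node("contact quality", 2.0),
        Node("plate discipline", 1.0),
    ]),
    Node("defense", children=[
        Node("range", 0.5),
        Node("arm", 0.25),
    ]),
    Node("baserunning", 0.25),
])

print("\n".join(render(war)))
```

Clicking a number in the interface described above would correspond to expanding one node’s children here; the top-line figure never needs to be stored, because it can always be rebuilt from the leaves.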

They’re also toying with the idea of an alternative presentation for their WAR. Even though the current one correlates quite strongly with overall victories in a season when adding up individual WAR contributions from each team, BAM’s audience is different than the B-R/FanGraphs niche. BAM could, for example, judge players on a 1-to-100 scale using the same criteria as another would for WAR.

“There’s an argument to be made for putting out an extra scale,” Petriello said.

“What will make our version of WAR intriguing,” Willman said, “is the way we’re going to make it accessible.”

Not just with WAR, Tango said, but all of Statcast: “We’re in the third inning.”

The first and second have been voluminous. Last year, Statcast tracked 1,435,241 pitches, followed 328,405 balls in play and spit out 53,380,301 metrics – a fraction of its capabilities. Statcast cameras track players at 30 frames per second and assign each an X/Y coordinate relative to his place on the field. On a day with a full slate of games, the system produces 21 million X/Y coordinates for player positioning alone. Over the course of a full season, that’s 4 billion data points.
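The daily and seasonal figures can be checked against each other with a little arithmetic. This is only a sanity check on rounded numbers from the paragraph above; the one added assumption, flagged in the code, is that a full slate means 15 games (all 30 teams playing).

```python
# Figures quoted above (already rounded by the source).
COORDS_PER_FULL_SLATE_DAY = 21_000_000
COORDS_PER_SEASON = 4_000_000_000

# Assumption: a "full slate" means all 30 teams play, i.e. 15 games.
GAMES_PER_FULL_SLATE = 15

coords_per_game = COORDS_PER_FULL_SLATE_DAY / GAMES_PER_FULL_SLATE
implied_full_slate_days = COORDS_PER_SEASON / COORDS_PER_FULL_SLATE_DAY

print(f"coordinates per game: {coords_per_game:,.0f}")
print(f"full-slate days implied by the season total: {implied_full_slate_days:.0f}")
```

The season total works out to roughly 190 full-slate days, in the neighborhood of a six-month season, so the two rounded figures are at least consistent with each other.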

When asked why they created Statcast, BAM CEO Bob Bowman said it was “our Six Million Dollar Man theory, which is that if we can do this, we should.” They’ve done it, and they could be on the cusp of something revolutionary. Statcast isn’t here just to help the world understand the game better.

It’s here to fundamentally change how we consume baseball.

Sabermetrics were born in the 1940s with Allan Roth, the egghead who brought statistical analysis to the Brooklyn Dodgers. Bill James mainstreamed them, more teams adopted them, Moneyball glorified them and some of the brightest minds in sports won championships because of them. And yet outside of some marginal but seminal advancements – the increased value of on-base percentage, the emphasis on pitcher strikeouts, the acknowledgement that defense matters and can be measured – sabermetrics remain a fringe area integrated into the game not seamlessly but by winning over converts or simply brute-forcing it. On some TV and radio broadcasts, viewers will take their medicine and like it.

This struck Bowman, one of commissioner Rob Manfred’s closest advisers, as ridiculous. If the idea of sabermetrics scares some, it’s incumbent on him, someone who recognizes their value, to tailor them to an audience that is plenty capable of appreciating them. The best way to do that, he figured, is to mix the medicine with some chocolate ice cream.

“What we’re trying to do is we want to make it relevant and relative,” Bowman said. “Relevant to what fans are watching right now and relative to other players and similar situations.”

Essentially, Bowman wants to simplify. The advent of Statcast was glorious because of the type of data it procured. It timed an outfielder’s jump to the hundredth of a second and calculated his speed and offered all kinds of nuggets of nerdery.

“Some people do want all that and some people don’t,” Bowman said. “For those who don’t, what we were doing in the past was adding five statistics. All these things that actually yielded one: catch rate. We’re mindful that there are people out there who want to say, ‘Did he make the catch or not?’ We’re trying to add one piece of data that even casual fans would say, ‘I can listen to that.’ ”

Major League Baseball Advanced Media CEO Bob Bowman is one of commissioner Rob Manfred’s closest advisers. (AP)

Catch probability is a single number, presented in percent form and based on just two measurements: the distance an outfielder runs and the time a batted ball is in the air. In the Sloan presentation, they show Matt Kemp taking a circuitous 65-foot route to a ball that hangs in the air for 4.3 seconds and making a diving catch. It looked tough. In reality, it’s a routine flyout, caught 75 percent of the time. The slide that shows Billy Hamilton, on the other hand, has him covering 71 feet in 3.7 seconds and flying to glove a ball tracked down just 7 percent of the time.
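The two inputs combine naturally into one intuitive quantity: the average speed the fielder has to sustain to reach the ball. The sketch below is back-of-the-envelope arithmetic on the two plays described above, not BAM’s model, which this article does not spell out; the function name is made up for the example.

```python
def required_speed_ft_per_s(distance_ft: float, hang_time_s: float) -> float:
    """Average speed needed to cover distance_ft before the ball comes down."""
    if hang_time_s <= 0:
        raise ValueError("hang time must be positive")
    return distance_ft / hang_time_s


# (distance in feet, hang time in seconds, quoted catch rate) from the slides above.
plays = {
    "Kemp": (65.0, 4.3, 0.75),
    "Hamilton": (71.0, 3.7, 0.07),
}

for fielder, (distance_ft, hang_time_s, catch_rate) in plays.items():
    speed = required_speed_ft_per_s(distance_ft, hang_time_s)
    print(f"{fielder}: {speed:.1f} ft/s needed, caught {catch_rate:.0%} of the time")
```

Kemp needed to average about 15.1 feet per second and Hamilton about 19.2, which shows how a few extra feet and a few tenths of a second less separate a routine play from a rare one.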

“Then you get into all the in-between ones,” Tango said. “A 40 percent play and 80 percent play are very close. Less than a second of hangtime. Fifteen to 20 feet of positioning. At a single-play level, that’s where this thing is going to shine.”

That’s part of the excitement: Defensive WAR has been more guesswork than exact science. Statcast exists for exactitude. Even better, Statcast takes only 10 to 12 seconds to give a play’s precise details, meaning before the next pitch anyone who cares to will be able to contextualize just how good – or at least rare – a catch really was. BAM’s data warehouse then can be queried to provide context, and highlight clips of similar or better catches can be compared and contrasted on demand.

The hope is to integrate Statcast into every pregame, game and postgame – for it to be brute-forced some, sure, but more as a storytelling device than a number. In the matchup of a great base stealer against a pitcher with a poor pickoff move, it’s easy to highlight the leadoff threshold that more or less guarantees a stolen base. After Bryce Harper reportedly injured his shoulder, the average velocity on his hardest throws dropped more than 4 mph, lending credence to the injury despite the protestations of the team otherwise.

“If they have a question, we’ll have an answer,” Tango said. “But if they don’t have a question, we’re not going to force them to look at anything.”

On any given pitch these days, Statcast tracks roughly 686 metrics. Some find wonky use. Seth Lugo, an anonymous Mets pitcher who joined the rotation last season because of injuries, never would have been known as the King of Statcast if not for his absurdly high curveball spin rate. Most float into the ether of BAM servers, perhaps to be rescued some other time.

Already over the past two years the data warehouse – where right now Willman’s product, Baseball Savant, categorizes and offers it for free – has grown up to allow far more complicated queries without the wait time. Statcast zoomed through its childhood. Now, like every adolescent, it just wants to understand its power and figure out how to wield it properly.

Statcast cameras record high-def video of every pitch of each game at all 30 MLB parks. (Fort Worth Star-Telegram via Getty)

Because more is coming. Over the past year, a prototype 8K video camera has been integrated into Statcast for testing. The video is so stunning, so rich, that the system theoretically would need only one camera, high above home plate, to capture every movement on the field with as much precision as it does today.

It’s vital for Statcast to function. The data needs to be good, safe, secure and bountiful, because Willman, Petriello and Tango are always trying to wrangle the gift of literally billions of data points. Last year, BAM introduced something similar to catch probability for the hitters called barrels. It uses two variables: launch angle, or the degree of loft a hitter generates on a swing, and exit velocity, or speed of the ball off the bat. There is a perfect sweet spot: 26- to 30-degree launch angles with 98-mile-per-hour-plus exit velocity tend to produce a home run. There are exceptions. Some poorly hit balls go out. Other monster shots fall short. Most hit the sweet spot of the bat.
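The sweet spot quoted above amounts to a simple two-variable test. The sketch below encodes only those quoted thresholds and should be read as a simplification: the production definition of a barrel may differ, and the function name and sample batted balls are invented for illustration.

```python
def in_quoted_sweet_spot(launch_angle_deg: float, exit_velocity_mph: float) -> bool:
    """True if a batted ball falls in the window quoted above:
    launch angle of 26 to 30 degrees and exit velocity of at least 98 mph."""
    return 26.0 <= launch_angle_deg <= 30.0 and exit_velocity_mph >= 98.0


# Hypothetical batted balls: (launch angle in degrees, exit velocity in mph).
samples = [
    (28.0, 104.0),  # squared up
    (28.0, 92.0),   # right loft, not hit hard enough
    (12.0, 108.0),  # hit hard, but a line drive rather than a fly ball
]

for angle, velocity in samples:
    label = "sweet spot" if in_quoted_sweet_spot(angle, velocity) else "outside"
    print(f"{angle:.0f} degrees at {velocity:.0f} mph: {label}")
```

As with catch probability, the appeal is that two measurements collapse into a single yes-or-no answer a broadcast can use.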

Barrels have yet to fully permeate the baseball lexicon. Maybe they will, maybe they won’t. That’s not the point, at least not all of it. If barrels never do catch on, BAM will try to find something that does. It may take an hour, a day, a week, a year, a decade. Nobody knows. They just keep hunting for the next great idea while trying to appreciate the challenge.

“We’re going to be able to find things in this forever,” Willman said.

“My grandchildren will let me know,” Tango said.

Like he said, it’s only the third inning. They’ve got nothing but time.
