December 11, 2009
The world of advanced baseball statistics can be an intimidating place for those of us who slept our way through advanced algebra or haven't been a follower of the Bill James revolution from the beginning.
Still, that doesn't mean that we should feel left out when it comes to another way of understanding and appreciating the game we all love. With that in mind, BLS stat doctor Alex Remington will explore a new advanced statistic each week during the offseason, providing a simple primer for the uninitiated.
Today's statistic: wOBA
What it stands for: Weighted On Base Average, as devised by Tom Tango
How they calculate wOBA: Hoo boy! Let's start with the theory. It's called wOBA because the number is scaled to look like OBP (which is also sometimes called OBA), not because it actually is a variant of OBP. In this regard, it's like Tom Tango's FIP, which is scaled to look like ERA. (Many advanced metrics are scaled to look like stats we're more familiar with. Baseball Prospectus's EqA, or Equivalent Average, is another all-purpose offensive stat that's scaled to look like Batting Average.)
wOBA is scaled to OBP for two reasons. First, we're all familiar enough with OBP to be able to eyeball and tell a good percentage from a bad one. Second, it means that league OBP is defined to equal league wOBA.
As Dave Cameron of FanGraphs writes, wOBA is useful "when you just want to know how a batter did at the plate, regardless of who was on base or what the score was at the time." It's basically a variant of linear weight formulas, which George Lindsey and Pete Palmer helped develop. Linear weights attempt to properly value a player's contributions at bat by weighting each possible outcome (walk, home run, single, double, etc.) with regard to the number of additional runs that player's team can expect to score as a result.
For example, home runs have a run value well over one because runners can be on base, and singles have a value slightly higher than walks, because singles frequently move baserunners from first to third or from second to home. The exact coefficients are determined by an analysis of game data. They are, therefore, relatively precise averages.
The run value of each outcome is compared to the run value of an out, which is defined as zero. (We're ignoring the notion of "productive" outs, because we're only concentrating on the person at bat.) The values are then scaled up by 15 percent (that is, multiplied by 1.15) so that league-average wOBA equals league OBP. As Tom Tango says: "In other words, an average hitter is around 0.340 or so, a great hitter is 0.400 or higher, and a poor hitter would be under 0.300."
The actual formula looks something like this:
wOBA = ((0.72 x NIBB) + (0.75 x HBP) + (0.90 x 1B) + (0.92 x RBOE) + (1.24 x 2B) + (1.56 x 3B) + (1.95 x HR)) / PA
(NIBB stands for non-intentional bases on balls, i.e. unintentional walks; intentional walks are left out because batters have no control over them. RBOE stands for reached base on error.)
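To make the formula concrete, here is a minimal Python sketch that applies the coefficients above to a stat line. The function name, the dictionary keys, and the sample season line are all made up for illustration; only the weights come from the formula.

```python
# Coefficients copied from the formula above (fixed, generalized weights).
WOBA_WEIGHTS = {
    "NIBB": 0.72,  # unintentional walks
    "HBP": 0.75,   # hit by pitch
    "1B": 0.90,    # singles
    "RBOE": 0.92,  # reached base on error
    "2B": 1.24,    # doubles
    "3B": 1.56,    # triples
    "HR": 1.95,    # home runs
}


def woba(counts, plate_appearances):
    """Weighted sum of a batter's outcomes divided by plate appearances."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    total = sum(weight * counts.get(event, 0)
                for event, weight in WOBA_WEIGHTS.items())
    return total / plate_appearances


# Hypothetical season line, invented for this example (not a real player).
sample = {"NIBB": 60, "HBP": 5, "1B": 100, "RBOE": 5, "2B": 30, "3B": 3, "HR": 25}
print(round(woba(sample, 600), 3))
```

Outs contribute nothing to the numerator, which is why they never appear in the weights table: they only show up through the plate appearance count in the denominator.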
What wOBA is good for: As Cameron explains, wOBA is nice for two reasons. First, it combines the insights of OBP (how good a batter is at getting on base) with the insights of SLG (how good a batter is at hitting for power), but it also properly weights each outcome, and is therefore superior to non-weighted stats like OPS and OPS+. Simply put, it's a better method to see who's the best hitter.
The second nice thing about wOBA is this: Since it's calculated from the products of run values, it's relatively easy to determine the run value above the league average for each plate appearance. Just subtract the league average wOBA (which equals league OBP), then divide by the scale factor that we used above (15 percent, or 1.15).
Runs above average per PA = (player's wOBA - league wOBA) / 1.15
What does this mean? Well, if a player has a wOBA of .353, and league OBP is .330 that year, the player has produced an average of 0.02 runs above average per plate appearance. If that player had 600 PA that year, he produced 12 runs above league average.
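The arithmetic in that example can be checked with a few lines of Python. This is only a sketch: the function name is mine, and the 1.15 scale factor is treated as a fixed constant even though it is approximate.

```python
SCALE_FACTOR = 1.15  # approximate wOBA scale factor from the article


def runs_above_average(player_woba, league_woba, plate_appearances,
                       scale=SCALE_FACTOR):
    """Convert a wOBA gap into runs above (or below) league average."""
    per_pa = (player_woba - league_woba) / scale
    return per_pa * plate_appearances


# The example from the text: .353 wOBA, .330 league OBP, 600 PA.
print(round(runs_above_average(0.353, 0.330, 600), 1))
```

A player whose wOBA is below the league mark simply comes out negative, meaning runs below average.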
As you might expect, league MVPs Albert Pujols (.449) and Joe Mauer (.438) led the majors in wOBA by a considerable margin. Hitless wonders Yuniesky Betancourt (.271) and Emilio Bonifacio (.277) were the worst.
Using the formula above, we can determine the approximate number of runs above average each player produced. (This stat is also known as wRAA and is tracked on FanGraphs. My calculation below is approximate because the scale factor of 1.15 can change from year to year.) Since the league OBP in the NL was .331, and the league OBP in the AL was .336, that means that Pujols produced approximately 70 runs above average while Mauer's output was approximately 54 above average. Bonifacio and Betancourt, meanwhile, were both more than 20 runs below average.
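Here is a rough sketch of the per-plate-appearance step behind those totals, using only the wOBA and league OBP figures quoted above. Plate appearance counts aren't given in this article, so the sketch stops at the rate; the names in the code are my own.

```python
SCALE_FACTOR = 1.15  # approximate; the real factor varies by season

# League OBP and player wOBA values as quoted in the text.
LEAGUE_OBP = {"NL": 0.331, "AL": 0.336}
PLAYERS = [
    ("Albert Pujols", "NL", 0.449),
    ("Joe Mauer", "AL", 0.438),
]


def runs_per_pa(player_woba, league_obp, scale=SCALE_FACTOR):
    """Runs above league average contributed per plate appearance."""
    return (player_woba - league_obp) / scale


rates = {name: runs_per_pa(w, LEAGUE_OBP[league]) for name, league, w in PLAYERS}
for name, rate in rates.items():
    print(f"{name}: {rate:+.3f} runs above average per PA")
```

Multiplying each rate by the player's plate appearances gives the season total. At a rate near 0.10 runs per PA, it takes a little under 700 plate appearances to reach the roughly 70 runs quoted for Pujols.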
How wOBA works: wOBA goes through reams of data to determine proper coefficients and then spits out a nice, clean number. I know this seems complicated, but there's just no way to do it on the back of an envelope. As with OPS+ and nearly all other advanced stats, you basically can't calculate it on your own without a dedicated database and a reliable data stream. Colin Wyers of Hardball Times, for example, used a SQL script to compare wOBA with EqA.
Tom Tango notes that a player's wOBA will converge on his OBP if he tends to slug and get on base in league-average proportions. If a player gets on base a lot but doesn't hit for much power, his OBP will be higher than his wOBA; if he hits for a lot of power but doesn't get on base a lot, his OBP will be lower than his wOBA.
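Tango's point about OBP and wOBA pulling apart can be illustrated with two invented stat lines. This is a sketch under loud assumptions: both hitters are hypothetical, and the OBP here is simplified to times on base per plate appearance (real OBP also counts intentional walks and leaves sacrifice bunts out of the denominator).

```python
# Coefficients from the formula earlier in the article.
WEIGHTS = {"NIBB": 0.72, "HBP": 0.75, "1B": 0.90, "RBOE": 0.92,
           "2B": 1.24, "3B": 1.56, "HR": 1.95}


def woba(line):
    """Generalized wOBA: weighted outcomes per plate appearance."""
    return sum(w * line.get(k, 0) for k, w in WEIGHTS.items()) / line["PA"]


def simple_obp(line):
    """Simplified OBP (assumption): walks, HBP and hits per plate appearance."""
    on_base = sum(line.get(k, 0) for k in ("NIBB", "HBP", "1B", "2B", "3B", "HR"))
    return on_base / line["PA"]


# Two hypothetical hitters, invented for illustration only.
contact_hitter = {"PA": 600, "NIBB": 70, "HBP": 5, "1B": 140,
                  "2B": 15, "3B": 2, "HR": 3}
slugger = {"PA": 600, "NIBB": 30, "HBP": 3, "1B": 70,
           "2B": 35, "3B": 3, "HR": 40}

for label, line in (("contact hitter", contact_hitter), ("slugger", slugger)):
    print(f"{label}: OBP {simple_obp(line):.3f}, wOBA {woba(line):.3f}")
```

The walk-and-singles hitter ends up with an OBP well above his wOBA, while the slugger's wOBA outruns his OBP, which is the pattern Tango describes.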
The simple formula above will produce a generalized wOBA, which ignores year-to-year fluctuations in run environment and whose coefficients are precise only to two decimal places. If you're a database pro, you can calculate your own, but I recommend going to a site that makes wOBA easily accessible, like FanGraphs.
When wOBA doesn't work: There's an ongoing, albeit mild, debate on the Internet about whether wOBA is better than EqA or vice versa. It mostly seems to be carried on by partisans of Tom Tango and Baseball Prospectus, each of whom has a substantial and substantially overlapping fanbase. Baseball Prospectus's Christina Kahrl believes that EqA is more accurate, but Colin Wyers tested the two and was unable to find a significant difference. The simple answer is that both are good.
wOBA is a good way to analyze an individual hitter, but it isn't a sufficient tool to analyze how much a player contributed to his team's wins. It evaluates outcomes, but not situational outcomes: it doesn't care who's on base, or how many outs there are, or what the score is, or what inning it is. Another Tom Tango stat, WPA (Win Probability Added, next week's lesson), measures the amount by which each play in a game makes a victory more or less likely. For the purposes of WPA, a homer in the bottom of the ninth is worth more than a homer in the bottom of the first, and a homer when you're up by 10 is worth less than one when you're down by one. In wOBA, all homers are equally valuable. This isn't a critique, as wOBA wasn't designed for win expectancy; it's just a point worth noting.
Also, the individual coefficients, because they're fixed, aren't completely precise: as Tango writes at one point, "The run value of the stolen bases is fixed at .20, and the caught stealing is set with a bit of a fudge, but works fairly well."
Why we care about wOBA: As long as you have access to an Internet connection, there's just no reason to use OPS ever again.
Next week's lesson: WPA