Eyes in the sky: Optical tracking data has us asking better questions, but answers remain elusive

On Friday afternoon, during the first day of panels at the 2013 MIT Sloan Sports Analytics Conference, NYU-Polytechnic Institute professor Dr. Philip Z. Maymin presented a research paper that looked, in part, at how often individual players speed up and slow down while playing, and where they most often do so. Maymin pored over more than 30,000 half-court set plays captured by STATS' SportVU optical tracking technology (which I've written about before) in the hopes of creating "a suitable dynamic language" that would make it easier to talk about the kinds of things that static Xs and Os on a chalkboard can't really display — the best spots for certain players to begin possessions, the trajectories they should take as they move, how they should time their cuts/screens/drives, etc.
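For a rough sense of what "speeding up and slowing down" means in coordinate terms, here's a minimal Python sketch. To be clear, this is not Maymin's method: the function names, the toy path and the choice of simple frame-to-frame differences are all my assumptions, and the 25 frames-per-second default is just the SportVU capture rate.

```python
import math

def speeds(positions, fps=25):
    """Speed between consecutive (x, y) samples, in distance units per second."""
    dt = 1.0 / fps  # seconds between frames
    return [math.dist(a, b) / dt for a, b in zip(positions, positions[1:])]

def accelerations(positions, fps=25):
    """Change in speed between consecutive frame pairs (positive = speeding up)."""
    dt = 1.0 / fps
    v = speeds(positions, fps)
    return [(v2 - v1) / dt for v1, v2 in zip(v, v[1:])]

# Hypothetical path, invented for illustration: a player covering 1, 2, then 3 feet
# per sample along the baseline. fps=1 keeps the arithmetic easy to follow.
toy_path = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (6.0, 0.0)]
toy_accels = accelerations(toy_path, fps=1)
print(toy_accels)
```

Real tracking coordinates are noisy, so anyone doing this for keeps would presumably smooth the positions before differencing them.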

There was some interesting stuff in Maymin's talk — players tend to accelerate or decelerate most rapidly and most often in the paint, at the top of the key and past the elbows; centers tend to exhibit bursts of acceleration more than other positions because they're screening and rolling or popping a lot; Paul Pierce accelerates more often than you'd think for a dude who seems to do everything in slow motion. But it still felt quite a ways away from a usable framework for defining basketball plays that would enable us to really measure half-court execution. When the presentation ended, John Hollinger (formerly of ESPN, now with the Memphis Grizzlies) offered his estimation:

To some extent, that's the case for just about everything presented at Sloan. Asked during the conference's opening panel how far along teams are in translating collected information into action on the court, Houston Rockets general manager/Sloan co-founder Daryl Morey offered a sobering assessment: "We're nowhere yet."

Nowhere might be a bit of a stretch — if nothing else, the info gleaned from studying the advanced statistics and on-court habits of highly efficient offensive players and teams seems to have helped shape the philosophy of a Rockets squad that ranks as the league's fourth most potent in terms of points scored per possession. But one of the most common recurring themes during the two-day conference in Boston was that, for all the work that's been done to compile more and more information, there's still an amazingly long way to go in the journey to actual actionable decision-making help.

That's especially true for work based on the SportVU data. Because while the special six-camera system tracks an insane amount of data — the movement of 10 players, three referees and the ball in 25 frames-per-second high-definition video, which translates into about 1 million raw data records per game — it's still not enough.
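If you want to see where "about 1 million" comes from, here's a back-of-the-envelope Python sketch. It assumes one record per tracked object per frame and counts only 48 minutes of regulation game clock; both of those are my assumptions, not STATS' spec.

```python
PLAYERS = 10
REFEREES = 3
BALLS = 1
FRAMES_PER_SECOND = 25
REGULATION_MINUTES = 48  # assumption: game clock only, no overtime or stoppages

def records_per_game(minutes=REGULATION_MINUTES):
    """One record per tracked object per frame (an assumed record layout)."""
    tracked_objects = PLAYERS + REFEREES + BALLS
    frames = FRAMES_PER_SECOND * minutes * 60
    return tracked_objects * frames

total_records = records_per_game()
print(f"{total_records:,}")  # 1,008,000
```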

"I don't know that we've got enough data to draw any significant conclusions," San Antonio Spurs general manager R.C. Buford said during Saturday afternoon's headline basketball analytics panel.

Why not? For starters, only 15 of the 30 NBA teams have purchased and installed the SportVU system in their arenas — the New York Knicks, Toronto Raptors, Washington Wizards, Golden State Warriors, Houston Rockets, San Antonio Spurs, Boston Celtics, Milwaukee Bucks, Oklahoma City Thunder, Minnesota Timberwolves, Philadelphia 76ers, Phoenix Suns, Orlando Magic, Dallas Mavericks and Cleveland Cavaliers. That means half the league's teams and players only get tracked during road games, so there's way less information on them in the system to analyze, which necessarily skews results and makes apples-to-apples comparisons within the data set really difficult, if not impossible.

For another, two of the three papers that centered on the SportVU data set — Maymin's look at acceleration and a four-author MIT consideration of whether it's more beneficial for teams to send an extra man to the offensive glass or retreat immediately on defense when a shot goes up — were based on information culled from 233 regular and postseason games played during the lockout-shortened 2011-12 season. (The third, the examination of interior defense co-written by Kirk Goldsberry and Eric Weiss, included numbers from this season.)

That sounds like a lot, until you remember that 30 teams played 66 regular-season games each last season, which works out to 990 games; add in the 84 total postseason games, and you're talking about a data set that only encompasses a little under 22 percent of all the NBA basketball that was actually played. Would you feel comfortable making iron-clad statements about teams, players and patterns just 18 games into an 82-game season? Probably not.
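Here's a minimal Python sketch of that coverage arithmetic. The function name and the flag are mine. Note that the answer depends on how you count: every regular-season game involves two teams, so 30 teams playing 66 games each is 990 games, and using 30 × 66 team-games as the denominator instead roughly halves the share.

```python
def coverage_share(tracked_games, teams, games_per_team, postseason_games,
                   count_each_game_once=True):
    """Fraction of all games played that appear in the tracked sample."""
    regular_season = teams * games_per_team  # team-games
    if count_each_game_once:
        regular_season //= 2  # two teams share every game
    return tracked_games / (regular_season + postseason_games)

share_by_game = coverage_share(233, 30, 66, 84)
share_by_team_game = coverage_share(233, 30, 66, 84, count_each_game_once=False)

print(f"{share_by_game:.1%}")       # 21.7%
print(f"{share_by_team_game:.1%}")  # 11.3%
```

Scaled to an 82-game season, those shares work out to roughly 18 games and nine games, respectively.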

And then you remember that the post-lockout sprint involved virtually no training camp or preseason preparation, even fewer opportunities for in-season practice and development work than usual, unbalanced schedules featuring back-to-back-to-back stretches for every team, and less accurate shooting (on 2-pointers, 3-pointers and free throws) and scoring (in both average points per game and points scored per possession, which adjusts for the different paces at which teams play) than any of the prior five full seasons, according to Hoopdata's numbers. Given how weird and irregular last season was, you start to wonder how much you can really rely on this particular data set in the first place.

To their credit, Maymin (who knows he isn't close to the formal "language" of plays he's looking to create) and the other panelists acknowledged the limitations not only in the sample with which they were working, but also the scope of their respective projects.

The authors of the "crash or retreat?" paper found that sending two players to the glass and one back on D rather than one to rebound and two to retreat produced a net gain of 0.17 points per possession; put another way, in a game in which your team missed 25 jump shots, the simple strategic change of switching one player's responsibility when a shot goes up could net your offense an extra four points. Considering how many close games we see night in and night out in the NBA, that could be a huge deal.
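The math behind that "extra four points" is just the per-possession gain multiplied by the number of misses. A minimal Python sketch, with a function name I made up and the simplifying assumption that every missed jumper counts as one possession where the gain applies:

```python
GAIN_PER_POSSESSION = 0.17  # net points per possession, as reported in the paper

def extra_points(missed_shots, gain=GAIN_PER_POSSESSION):
    """Expected extra points if the gain applies on every missed shot (assumption)."""
    return missed_shots * gain

expected_gain = extra_points(25)
print(f"{expected_gain:.2f}")  # 4.25
```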

But their study looked solely at missed shots and what those misses led to, without accounting for stuff that would seem to have a lot of impact on the likelihood of grabbing an offensive rebound, like the personnel on the floor during the play, where the shot was taken and the position of the shooter relative to defenders. It also didn’t consider game situation (time, score, whether either team's in foul trouble, etc.), which would probably play into coaches' and players' decision-making processes about whether they should attack or defend.

Similarly, Goldsberry and Weiss' spatial analysis of the SportVU information made a compelling argument for Milwaukee Bucks center Larry Sanders being the league's premier interior defender (more on that later) and introduced new means of measuring how effective NBA big men are at holding opposing shooters to low percentages inside, deterring them from even attempting shots near the basket and getting close enough to a field-goal attempt to contest it. But the authors didn't correlate their findings to overall defensive efficiency (do teams with strong interior defenders tend to allow fewer points per possession?) or winning (do teams with players who prevent interior attempts/hold opponents to lower interior field goal percentages tend to win more than others?).

Plus, basketball is a five-man game; with the exceptions of pure one-on-one isolation plays, virtually every individual defensive action depends heavily on the action of at least one (and likely more) teammate earlier in the possession. Team communication, defensive systems, assignments within the context of a scheme ... that's all stuff the SportVU optical tracking data, and subsequent analysis of it, can't quite get at yet.

That said, even though these projects are necessarily imperfect and incomplete, they're still important, because they represent a jumping-off point for asking more and better questions.

The "crash or retreat" authors' work could lead, down the line, to better understanding of the pure point value of the tradeoffs teams make in deciding whether to attack the glass or hustle back immediately. Goldsberry and Weiss' work could lead to a more granular and nuanced grasp of individual and team defense, which have long been perhaps the least understood elements of basketball. Maymin's work could lead to a fresh new method of quantifying in-play execution, which is just about always the division between good NBA teams and bad ones. And a million other things that we don't know about, because those 15 teams jealously guard their specific applications of SportVU data, could become common knowledge in the years ahead as those eyes in the sky record more games.

There's gold to be found here; but first, people have to mine it, even if they're only armed with pickaxes rather than blast hole drills.

"Ten years from now, we're going to look back and laugh at this study," Goldsberry said after his presentation. "But we have to start somewhere."

There's a reason the number of teams who have invested in SportVU (which reportedly costs between $75,000 and $100,000 per season) has risen year after year, and a reason why Indiana Pacers general manager Kevin Pritchard — whose team doesn't yet have the cameras — thinks they're about to become ubiquitous.

"It's coming — I don't think there's any doubt that one day all 30 teams will have the data capture [technology]," Pritchard said during Saturday's basketball analytics panel. "[The information] might not always get disseminated, but it's coming."

And it might be coming sooner than we thought.

University of Southern California researchers Rajiv Maheswaran, Yu-Han Chang, Aaron Henehan and Samantha Danesis won the prize for the top research paper at the 2012 Sloan conference for using SportVU data to track missed shots as they descended off the rim, find out who grabbed boards, where they grabbed them and what all that could tell us about offensive rebounding. (You can watch that presentation here.) This year, Maheswaran returned to showcase what they've been doing since — namely, working on closing the distance between collecting data and extracting wisdom.

"The future is how we bridge this gap," Maheswaran said.

To hear him tell it, the bridge is called Eagle — a database he and his colleagues have been developing that processes all the optical tracking data SportVU spits out into quickly searchable lists with a ton of different filters and queries, creates quick and interactive visualizations of the information (like shot charts and heat maps) and more.

With Eagle, according to Maheswaran, not only can you look at every shot taken in the NBA over the past two years; you can get more detailed information on how likely any shot is to go in, how valuable specific types of shots from specific areas are in relation to other types, how much an individual defender matters relative to other defenders, how shot accuracy is affected by defender distance, how/where/why players move in time during the course of a possession ... and that's just for starters.

What makes Eagle an important advancement in the optical data field, according to Maheswaran, is that it makes the experience of combing through all the SportVU location data exponentially quicker and easier, even for non-geniuses. Want to find out who ranks among the league's most efficient scorers close to the rim, off the dribble, with a defender within 1 foot of them? In a matter of seconds, Maheswaran can reach into the system and pull out a top-10 list, with Knicks center Tyson Chandler up top.
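I haven't seen Eagle's internals, so take this as a toy Python sketch of what a filtered query like that might look like, not as the real thing. The `Shot` fields, the 5-foot cutoff for "close to the rim," the minimum-attempts rule and the sample rows are all hypothetical, invented purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Shot:  # hypothetical record layout, not Eagle's or SportVU's schema
    player: str
    rim_distance_ft: float
    off_dribble: bool
    defender_distance_ft: float
    points: int  # points scored on the attempt (0 on a miss)

def top_scorers(shots, max_rim_ft=5.0, max_defender_ft=1.0, min_attempts=2, n=10):
    """Rank players by points per attempt on shots that pass every filter."""
    totals = defaultdict(lambda: [0, 0])  # player -> [points, attempts]
    for s in shots:
        if (s.off_dribble and s.rim_distance_ft <= max_rim_ft
                and s.defender_distance_ft <= max_defender_ft):
            totals[s.player][0] += s.points
            totals[s.player][1] += 1
    ranked = [(player, pts / att) for player, (pts, att) in totals.items()
              if att >= min_attempts]
    ranked.sort(key=lambda row: row[1], reverse=True)
    return ranked[:n]

# Made-up rows, not real NBA data.
toy_shots = [
    Shot("Player A", 2.0, True, 0.8, 2),
    Shot("Player A", 3.0, True, 0.5, 2),
    Shot("Player B", 2.5, True, 0.9, 2),
    Shot("Player B", 4.0, True, 1.0, 0),
    Shot("Player B", 3.0, True, 6.0, 2),   # filtered out: defender too far away
    Shot("Player C", 20.0, True, 0.5, 3),  # filtered out: too far from the rim
]
leaders = top_scorers(toy_shots)
print(leaders)  # [('Player A', 2.0), ('Player B', 1.0)]
```

The point of the sketch is only that each condition in the question becomes one filter; the hard part, which Eagle is supposed to handle, is deriving fields like defender distance from raw coordinates in the first place.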

"What you really want to do is leverage what people do well and what machines do well, and use them together," Maheswaran said. "We're right at the tipping point of […] where people who are not experts at computer science will be able to do this."

One big part of that process will be lining up the spatial/location-based information that SportVU collects with video analogues of possessions and individual plays — the kind of stuff available through Synergy Sports Technology's play-tracking data. That could help introduce some human context into all those x,y coordinates, which Goldsberry noted as one of the biggest drawbacks of the SportVU info ("There's a lot of work to do in, how do we actually represent the human being?"). According to Maheswaran, Eagle doesn't have that capability yet and won't in the immediate future, but he says that connecting the video and optical data isn't that far away: "It's not like growing a third ear or anything."

Whether Eagle's ability to quickly process SportVU data represents as big a leap forward as it seems remains to be seen; it's a good bet that the teams who've purchased SportVU — especially the ones who've had it for a while, like the Rockets and Celtics — have already made significant headway in terms of how they process, slice up and deconstruct all that data, even if they can't totally trust what it's telling them just yet. But for the other half of the league that's been tentative about getting into the optical tracking game, whether due to the costs of the system, the daunting task of figuring out how to find valuable information amid all those coordinates or both, the introduction of a software system that can expedite the process could be just what the doctor ordered.

And if teams are able to speed up the trip from data collection through data processing to in-game application, those of us in the cheap seats might start finding out what stuff like Maymin's acceleration paper actually means a whole lot sooner, too.
