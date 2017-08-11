Welcome to FC Yahoo’s Premier League preview week. We’ll take a look at each team in our aggregated predicted table, counting down from No. 20 to No. 1, and also reflect on some issues surrounding the league as kickoff approaches on Friday. Follow along with everything here.

It was exactly a month ago at an upscale hotel in central London that Danny Murphy — yes, the same Danny Murphy who used to patrol Premier League midfields for Liverpool and Fulham — sat on stage, in a slick white dress shirt, and in his new element. Murphy, now a popular BBC pundit, was part of a five-man panel at an Opta event discussing the future of data in soccer. And he was on a roll.

Almost 30 minutes in, he was animatedly explaining the value of using chances created, as opposed to assists, to judge playmaking ability when the panel’s moderator chimed in: “That’s relatively new,” he pointed out, referring to the chances created stat. Murphy responded: “Yeah, it is. It is. Exactly.” And he continued, his hands gesturing, his weight shifting side to side.

Not 30 seconds later, though, Murphy’s head snapped to his left. Duncan Alexander, Opta’s head of UK content, had interjected. “It’s interesting that you said that’s relatively new,” he began. “But back in ’96, Opta was collecting chances created data. It just took longer to reach the general public.”

OK, it didn’t just take longer. It took ages. Sometimes over a decade. But finally, it seems, the most basic of advanced stats are beginning to find their way into mainstream media coverage of the Premier League. Sky Sports is using per-90 metrics on TV broadcasts. Sky, the BBC, the Guardian, the Telegraph and others have run explainers on expected goals. Gradually, everyone is catching on.

Nothing you’ll see on a TV screen or in a Guardian article is eye-opening to anybody working inside a Premier League club. Opta, the sports data company headquartered in London, has been nurturing expected goals since last decade. Arsene Wenger has discussed it at news conferences. Wenger’s club, Arsenal, purchased an entire data company three years ago.

Now, several years after many clubs began integrating complex data analysis into scouting and recruitment, the same models and stats, albeit simplified versions of them, are bleeding out into the open.

“These things take a while to seep through to public consciousness,” Alexander said at the Opta event.

And now that they have, fans will be able to better understand the game they love.

Explaining and assessing expected goals

“Expected goals.” That’s the stat you’ll hear about most. It’s often abbreviated as xG. And, once you get comfortable with it, it’s a relatively simple concept.

Expected goals, on a game-by-game or full-season basis, is a measure that blends chance quantity with quality. In a way, it’s a beefed-up, more involved and more accurate version of pure shot totals. On an individual shot basis, it’s an intuitive numerical way to assess chance quality that is easy to digest.

Expected goals models, which are developed based on analysis of hundreds of thousands of shots from old games, assign a number, between 0.0 and 1.0, to every shot taken in a match. The number is the probability that the shot, based on a variety of factors, would result in a goal if taken by an average player. An xG value of 0.5 means there is a 50 percent chance the shot results in a goal. A value of 0.01 means there is a 1 percent chance.
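To make the arithmetic concrete, here is a minimal sketch in Python of how per-shot values might roll up into a match total. The shot values are invented for illustration, the function names are hypothetical, and the "at least one goal" calculation assumes each shot is independent of the others, which real matches only approximate.

```python
from math import prod


def match_xg(shot_values):
    """Sum per-shot scoring probabilities into a match xG total."""
    for p in shot_values:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"xG value must be between 0.0 and 1.0, got {p}")
    return sum(shot_values)


def prob_at_least_one_goal(shot_values):
    """Chance of scoring at least once, ASSUMING the shots are independent."""
    # The chance of scoring zero goals is the product of every shot missing.
    return 1.0 - prod(1.0 - p for p in shot_values)


# Hypothetical shot values for one team in one match (not real data).
shots = [0.5, 0.01, 0.08, 0.3]

print(f"Match xG: {match_xg(shots):.2f}")
print(f"Chance of at least one goal: {prob_at_least_one_goal(shots):.2f}")
```

The sum is what blends quantity with quality: twenty speculative efforts worth 0.01 each add up to less than a single shot worth 0.5.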

How exactly are those probabilities calculated? If you really want to dive in, Michael Caley, an American writer who has been at the forefront of the stat’s development, explained his methodology here. But the short answer is that the value of a given shot is based on some combination of the following considerations:

Shot location. The farther away a player is from goal, the less likely he is to score. And the farther a player moves away from the center of the field — as his visible angle of the goal decreases — the less likely he is to score.

Shot type. A shot with the foot is more likely to result in a goal than a header.

Assist type. For example, a chance created by a through ball is more likely to yield a goal than a chance created by a cross or a long ball.

Take-ons. If a shot comes after an attacker has beaten a defender with a dribble, the shot is more likely to result in a goal.

Passage of play. For example, a direct free kick from 30 yards out is more likely to be scored than an open play shot from 30 yards out. In Opta’s model, this includes “open play, direct free kick, set play, corner kick, assisted, throw-in.”

Speed of attack. Shots at the end of attacks that cover a lot of ground in a short time (i.e. counterattacks) are generally more likely to result in goals than shots at the end of slow buildups. This is a feature of Caley’s model.

1-v-1s. They are, naturally, more likely to result in goals than other shots.

There is one subjective factor: Big Chances. Opta’s game coders are responsible for tagging some shots as “Big Chances,” or “chances where a player should reasonably be expected to score.” This is a way of accounting for factors, such as an empty net, that data alone can’t pick up.

(Note: There is not one all-powerful xG model that every statistician uses; Caley’s model, for example, is different from Opta’s, which is probably different from Arsenal’s. But each likely has its own way of accounting for all these factors, even if the categorizations or methods of calculation are different.)
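For readers who want to see how factors like these could be combined, here is a toy sketch in Python. It is not Caley’s model, Opta’s or anyone else’s: it assumes a simple logistic formula, covers only some of the factors listed above, and every weight is invented for illustration. Only the direction of each weight (closer is better, headers are harder, and so on) comes from the considerations described here.

```python
import math
from dataclasses import dataclass


@dataclass
class Shot:
    distance_yards: float            # distance from goal
    angle_degrees: float             # visible angle of the goal from the shot location
    is_header: bool = False          # shot type
    from_through_ball: bool = False  # assist type
    after_take_on: bool = False      # attacker beat a defender with a dribble
    is_big_chance: bool = False      # subjective tag applied by a game coder


# Invented weights for illustration only; a real model would fit these
# to hundreds of thousands of historical shots.
INTERCEPT = -1.0
W_DISTANCE = -0.10   # farther from goal -> less likely to score
W_ANGLE = 0.03       # wider view of goal -> more likely to score
W_HEADER = -0.6      # headers are scored less often than shots with the foot
W_THROUGH_BALL = 0.5
W_TAKE_ON = 0.3
W_BIG_CHANCE = 1.2


def toy_xg(shot: Shot) -> float:
    """Return a probability between 0 and 1 from a weighted sum of shot factors."""
    score = (
        INTERCEPT
        + W_DISTANCE * shot.distance_yards
        + W_ANGLE * shot.angle_degrees
        + W_HEADER * shot.is_header
        + W_THROUGH_BALL * shot.from_through_ball
        + W_TAKE_ON * shot.after_take_on
        + W_BIG_CHANCE * shot.is_big_chance
    )
    # The logistic function squeezes any score into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-score))


# Two hypothetical shots: a central chance off a through ball, and a long-range effort.
close_chance = Shot(distance_yards=10, angle_degrees=40, from_through_ball=True)
long_shot = Shot(distance_yards=30, angle_degrees=12)

print(f"Close chance: {toy_xg(close_chance):.2f}")
print(f"Long shot:    {toy_xg(long_shot):.2f}")
```

The exact numbers mean nothing because the weights are made up. What the sketch shows is the shape of the calculation: each factor nudges the probability up or down, and the result always lands between 0 and 1.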

The idea is not that a computer will spit out the exact probability of a goal being scored at the moment the ball leaves a striker’s foot. The idea is that xG models provide an as-good-as-possible estimate of that probability based on objective factors, assuming a league-average player is on the end of the chance. Those estimates do not and should not serve as the be-all and end-all of soccer analysis, but they are particularly useful in a few main ways.