After several weeks of highfalutin, helicopter talk about the capital ‘S’ State of Analytics, this week I want to focus on something much more minor: Expected Goals. By dedicating a long post to them, I hope to clear xG chat off the decks for a while, as there is some more interesting work to discuss.
I think too often in football analytics event data gets treated as if it’s on some sort of different plane than the lived, ninety-minute games themselves. If football were a little more like hockey, perhaps, with eighty-game seasons and loads and loads of shots per 60 minutes, I might understand that approach. But neither the shot counts nor, obviously, the goals in football are so high as to dissuade us from taking a look, no matter how time consuming (though boy it would be nice to see something like the NBA’s ‘video boxscore’ tool).
The problem is that by not checking the tape, we risk oversimplification.
In the worst-case scenario (and at the risk of stuffing up a straw man), that could mean something like comparing a single game’s final score to its xG tally and saying, without any further information, “Oh, so-and-so team should have won.”
This is an extreme example, but even when you bring in the idea of looking at xGs as revealing a range of probable outcomes for single games (as per Danny Page, not that he would advocate that necessarily), I think we’re risking reducing football to something that, well, doesn’t resemble football.
To demonstrate why, let’s look at some (possibly illegal?) screencaps from Arsenal’s disappointing 1-1 draw with Crystal Palace last weekend, and compare them to data from Paul Riley’s Shot on Target xG tableau tool. Keep in mind that more than one proprietary xG model I’ve looked at—thanks to all those who clandestinely sent them over to me—had Arsenal beating CP 2-1 as the most probable outcome based on shot quality.
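For anyone who hasn’t seen how a “most probable outcome” falls out of a list of shots, here is a minimal sketch of the general idea, assuming every shot is an independent coin flip weighted by its xG. This is not Riley’s model, nor any of the proprietary ones mentioned above; the function names are made up for illustration, and the shot values are hypothetical numbers, not the actual Arsenal v Crystal Palace data.

```python
def goal_distribution(shot_xgs):
    """Exact probability of scoring 0, 1, 2, ... goals from independent shots."""
    dist = [1.0]  # before any shot is taken, zero goals is certain
    for p in shot_xgs:
        nxt = [0.0] * (len(dist) + 1)
        for goals, prob in enumerate(dist):
            nxt[goals] += prob * (1.0 - p)  # this shot is not scored
            nxt[goals + 1] += prob * p      # this shot is scored
        dist = nxt
    return dist


def match_outcomes(home_xgs, away_xgs):
    """Return (scoreline probabilities, (home win, draw, away win))."""
    home = goal_distribution(home_xgs)
    away = goal_distribution(away_xgs)
    scorelines = {(h, a): ph * pa
                  for h, ph in enumerate(home)
                  for a, pa in enumerate(away)}
    home_win = sum(p for (h, a), p in scorelines.items() if h > a)
    draw = sum(p for (h, a), p in scorelines.items() if h == a)
    away_win = sum(p for (h, a), p in scorelines.items() if h < a)
    return scorelines, (home_win, draw, away_win)


# Hypothetical per-shot xG values, NOT the real match data.
home_shots = [0.55, 0.35, 0.12, 0.08, 0.05, 0.05]
away_shots = [0.10, 0.07, 0.04]

scorelines, (home_win, draw, away_win) = match_outcomes(home_shots, away_shots)
most_likely = max(scorelines, key=scorelines.get)
print("most likely scoreline:", most_likely)
print(f"home win {home_win:.2f}, draw {draw:.2f}, away win {away_win:.2f}")
```

Note what the sketch takes for granted: each shot’s xG is treated as a fixed, known probability. That assumption is exactly what the screencaps are meant to poke at.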
I don’t want to do a thorough shot-by-shot analysis because that would be both time consuming for me and boring for you.
Instead, let’s focus in on three shots in particular—two of Arsenal’s highest individual xG shots, and Crystal Palace’s goal courtesy of Yannick Bolasie.
First we have Danny Welbeck’s shot from the six yard box. As a dot on Riley’s tableau, it looks positively deadly, and indeed, more than half of the players who have attempted on-target shots from that area of the pitch have scored. Danny didn’t, and while a GIF might be better here, it’s not hard to see why from the still:
Prior to this shot, the ball had bobbled around the box, and Welbeck had to go into a sort of half-bicycle kick to make contact in a fairly confined space. Palace keeper Wayne Hennessey had the angle pretty well covered, and while this was a decent chance you wouldn’t characterize it as a terrible miss, nor would you say Welbeck “finished poorly”—it’s entirely subjective but I think he did the best he could. Though the xG for on target shots from this area is 0.5547, I think if this particular chance with all its unique qualities played out ten thousand times over, Welbeck would probably score far less often than that.
Then, we have Olivier Giroud’s slightly less dangerous chance in the 76th minute.
Here again the same principle applies: Giroud slightly deflects an angled cross from Ozil while in the air with two opposition defenders right close to him. His body shape and angle toward goal mean any shot that came off his foot would almost certainly go into Hennessey’s waiting hands. And it did. The end.
Finally, Bolasie’s equalizer for Crystal Palace:
Even though the xG here is roughly one goal in every ten shots, there are some extenuating circumstances. Bolasie, as you can see, had a lot of room to tee up, which may have made him more accurate than if he had bashed a ball that had rebounded out of the box, something that happens a lot in football. But Cech also failed to parry properly, and it ended up in the net. Insert ASCII shrug here.
One might think here, “Well, all you need is a model that adds in these extenuating circumstances,” but even if you did, you’d be working with a pitifully small, wildly variable sample. I imagine calculating minutiae like body shape or goalkeeper angle would provide some diminishing returns to any xG model pretty quick.
And this doesn’t matter for how xG is used most of the time anyway. Riley’s model is still predictive of goal difference at large: rack up a higher xG count for and a lower one against, and you will likely score more goals than you concede.
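The small-sample worry about ever-finer shot categories can be made a bit more concrete. Below is a rough sketch, assuming we only care about how uncertain a raw conversion rate is at different sample sizes; it uses the standard Wilson score interval for a proportion. The goal and shot counts are hypothetical, chosen only so that the observed rate is the same in each row.

```python
import math


def wilson_interval(goals, shots, z=1.96):
    """Approximate 95% Wilson score interval for a conversion rate."""
    if shots <= 0:
        raise ValueError("shots must be positive")
    p_hat = goals / shots
    denom = 1.0 + z * z / shots
    centre = (p_hat + z * z / (2 * shots)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / shots + z * z / (4 * shots * shots)
    ) / denom
    return centre - half, centre + half


# Hypothetical counts: same observed rate, very different sample sizes.
for goals, shots in [(3, 10), (30, 100), (300, 1000)]:
    low, high = wilson_interval(goals, shots)
    print(f"{goals}/{shots}: observed {goals / shots:.2f}, "
          f"plausible range {low:.2f} to {high:.2f}")
```

The more narrowly you slice a chance (half-bicycle kick, keeper set, two defenders close by), the fewer comparable shots you have, and the wider that plausible range gets.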
The reason I’m doing this is to get across a couple of key points.
The first is that xGs aren’t really “high or low probability chances augmented or diminished by ‘finishing.’” There wasn’t much wrong with Welbeck’s or Giroud’s finishing against Crystal Palace from what are, on average, high conversion areas; to me at least, they did what they could based on the highly specific nature of the chances provided to them. There are of course geniuses who might up the numbers on converting shots like these, but they’re few and far between. And yes, there are instances where poor or excellent finishing makes the difference, but I suspect these happen far less often than we might think.
I also want to use the above examples to dispel the idea that “random variation” (the amorphous ‘stuff’ that moves the needle toward or away from a goal on any given xG shot) is always made up entirely of bounces or the wind or whatever. Often, ‘variation’ involves real, deliberate, human actions on the football pitch, even actions with demonstrably positive or negative outcomes.
Here’s the thing, though: just because actions were intentional doesn’t necessarily make them “repeatable.” Maybe one or two Arsenal players might have done better in the build-up to these particular chances; a short pass to keep position rather than a speculative cross might have been the best thing for Giroud’s shot, for example.
This is a very subjective call, however; there often isn’t some clear dividing line between signal and noise with these things. An experienced PA may notice clear, negative patterns that may be affecting Arsenal’s ability to convert dangerous chances. Or it may just be noise: a single game, and a few chances where Arsenal players did the best they could. The only way to know with slightly more confidence, however, is to check the tape.
The same goes for Arsenal’s defending on Bolasie’s goal. In isolation, a PA might look at that shot and think, “They should have closed him down.” But unless you’re noticing a higher than normal conversion rate from these areas as a rule, something backed up by clear, sloppy patterns in defense, it’s probably not worth doing much more than giving players a stern reminder to do their effing jobs (and checking to see if Cech is okay).
This isn’t to take away from xG, but to put its purpose(s) in context. Expected Goals are not about goals that teams “should or should not have” scored, nor are they a prescription to shoot exclusively from that red blob in the middle of the six yard box no matter what.
Rather, xG measures what tends to happen when players take certain types of shots from certain areas of the pitch. More broadly, it details how good teams are at getting into advantageous positions, and how good they are at preventing opponents from doing the same, and these qualities tend to be more predictive of long-term performance than goals or points in the table.
That doesn’t mean a team has to have a high xGD to win; see Leicester City. It just gives you a pretty decent clue about the relative, underlying strength of your team. But I can think of no compelling reason why reviewing game tape can’t help a coach or a team better understand whether under- or over-performance in relation to xG is as truly random as the data alone suggests. It may even be a guide in understanding when to bend or break the “rules” about shooting from high conversion areas.
You don’t need to be rich
So I don’t think we need too much more evidence that Leicester City are wildly over-performing this season and are probably set for regression either this season or next. From Michael Bertin looking at raw shot conversion numbers to Howard Hamilton noting the Pythagorean table has the Foxes 11 points above where they should be, by any metric, Leicester are riding their luck.
But this isn’t just a story about luck, as I pointed out a few weeks ago. It’s also about Leicester’s solid planning, preparation, smart management, etc. And they are not alone in this, either, as Michael Caley pointed out in a recent Washington Post column:
This analysis suggests that Leicester City is not so inexplicable after all, but rather the latest in a line of well-run smaller clubs to perform at a high level in the Premier League. David Moyes’s Everton consistently finished in the top half despite never paying the wages of the big sides, and Southampton has recently overtaken the Merseyside club as England’s model organization. Leicester looks relatively similar to the Saints and Toffees in its underlying numbers. The Foxes have further captured lightning in a bottle with their finishing rates (and especially with opposition finishing) while benefiting from down years by many of the traditional elite.
The lesson of this incredible season is not peculiar to Leicester. Good resource management, smart acquisitions and quality tactics can elevate a smaller club to competitiveness. The examples of Everton and Southampton show this, as well. Once you are competitive, you are in position to take advantage of a run of good fortune should it hit. And Leicester City has taken that advantage nearly all the way to a league title.
I think the idea that a smaller club could become the next Man City based on smart management alone is far-fetched, but solid planning, smart recruitment and sensible player development can all help smaller teams perform consistently enough so they can take advantage when the moment comes.
The question mark in all this might be Everton, though despite their terrible recent form they’re not exactly in danger of relegation. The stupid thing would be to replace Martinez in a fan- and media-driven panic, rather than use all the information at their disposal to assess the problem, pinpoint some of the common causes, and take action. And yes, that might mean replacing Martinez.