So last week, Bobby Gardiner Tweeted this:
Awful lot of people writing about analytics and very few doing anything at the moment. Good piece, but nothing new. https://t.co/eMZkIQuSIL
— Bobby (@BobbyGardiner) June 6, 2016
When I read it, I went a bit red from embarrassment.
After all, meta discussions about the use and application of analytics in football are my bread and butter, mostly because I think theory needs practice, and not all worthwhile analysis comes with a clear and realistic application at the club level.
But at some point it feels weird to keep talking about ‘analytics work’ in the abstract, particularly as there seems to be less and less of it kicking around online these days (which, to be clear, is an impression, not an empirical judgment).
And then I got to thinking about Arsenal’s recent pursuit of Leicester City’s goal-scoring, diamond-in-the-rough phenom Jamie Vardy. Though the soccer analytics community hasn’t written voluminously on the subject in response, Knutson’s assessment is, as ever, pretty good.
So, what if there were a way to think about the probability that Vardy will be a success at Arsenal without looking at his advanced stats?
Then I recalled this post on why forecasting could be a much more valuable and holistic approach to player recruitment than use of predictive metrics alone. I thought, maybe I could go through and scrape as much data as I could on playing minutes for newly recruited Arsenal strikers over the past fifteen years.
This skill, which is probably second nature for many of you, is, unfortunately for me at the moment, outside my wheelhouse. So I initially considered giving it another week while I figured it out.
But then I remembered Daniel Kahneman’s chapter in Thinking, Fast and Slow on the work of Paul Meehl and Robyn Dawes, whose studies show how even low-information, DIY algorithms do better than completely subjective ‘expert’ judgment in the prediction stakes:
The surprising success of equal-weighting schemes has an important practical implication: it is possible to develop useful algorithms without any prior statistical research. Simple equally weighted formulas based on existing statistics or on common sense are often very good predictors of significant outcomes.
And then I thought—what if I were a chairperson or a director of football who had to make a decision on a player in a relatively short period of time, for whatever reason? What if I had terrible math and software skills, but a reasonable understanding of basic statistical principles, and an ability to use the internet reasonably well?
In other words, what if I was me, but with a proper football job?
How would I go about deciding on the likelihood that Vardy would score, say, at least 15 goals for Arsenal next year, barring any injuries? I mean, at 29, Vardy’s not getting any younger, and at that price point, ideally you’d want a lot of production right away. So I’m comfortable with this prediction question.
What follows is me literally doing a back of the envelope calculation in real time, as I write this draft, so apologies if some of these numbers are wrong…that’s part of the point of this. Anyhoo, strap in!
Step 1: Take the Outside View
Let’s quote Kahneman again. When making a prediction, the Nobel laureate says the best method is to take a Bayesian approach, which means finding the base rate for whichever category of thing you’re predicting. This is what Kahneman calls the ‘outside view’, “…the prediction you make about a case if you know nothing except the category to which it belongs.”
This outside view, says Kahneman, should be the ‘anchor’ for whatever prediction you end up making. So this initial figure gives you a starting point, from which you make adjustments.
For my base rate, I’m going to take the percentage of Arsenal striker signings who scored 15+ goals in their first season, going back to, say, the 1999-2000 season. I’m going to exclude, for obvious reasons, strikers on loan or who returned from loan spells.
Some important caveats: When I woke up this morning and thought about how I would edit this post, I immediately realized I should have made my base rate the percentage of all Premier League strikers who scored 15+ goals in their first season over the last two decades or so. This would have a) provided a much bigger and more robust sample, and b) allowed me to weight my prediction more accurately based on age, previous club, goal record etc.
However, I don’t have the skill or expertise yet to do this, nor did I have time to figure out how to scrape all that data and put it in an Excel file, let alone actually manipulate it. Again, I’m sure for most of you this would take all of half an hour to do. Please do! I would be curious to see the result. But part of the point of this exercise is to show what can be done in a very short period of time by someone with very limited skill. So there we are.
Anyway, there were a few things I discovered in doing this, like (surprise!) Arsenal aren’t really into signing high-producing marquee forwards at the peak of their careers, roughly proxied by transfer fee. Of the 22 strikers they signed since 1999, only 6 or so went for over £10 million in the transfer market, though I didn’t adjust for inflation. One of them was Arshavin, by the way. Another was Thierry Henry, who, well, was an outlier, to put it mildly.
And if you think the 15+ goal target is high, I may as well have made it 10 goals…production for first-year strikers at Arsenal is generally barbell-shaped, with the likes of Henry and Alexis Sanchez on one side, and most of the rest on the other.
Anyway, onto the base rate: 18% of Arsenal strikers signed since 1999-2000 scored 15 plus goals in their debut season.
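As a quick sanity check on that figure, here is a minimal Python sketch that asks which whole number of strikers, out of the 22 signings mentioned above, rounds to 18%. The variable names are illustrative, and the count it finds is inferred from the rounded percentage rather than taken from the underlying list of players.

```python
# Total striker signings since 1999, as quoted in the post.
TOTAL_SIGNINGS = 22
# Base rate quoted in the post, as a whole-number percentage.
QUOTED_PERCENT = 18

# Find every count of 15+ goal debut seasons that rounds to the quoted percentage.
matching_counts = [
    k for k in range(TOTAL_SIGNINGS + 1)
    if round(100 * k / TOTAL_SIGNINGS) == QUOTED_PERCENT
]

for k in matching_counts:
    print(f"{k} of {TOTAL_SIGNINGS} = {k / TOTAL_SIGNINGS:.1%}")
```

Only one count fits, which underlines how small the sample is: a single extra prolific signing would move the base rate by more than four percentage points.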
Step 2: Adjust Base Rate by Size of Transfer Fee
That’s a low percentage, of course, but if you look only at strikers who signed for over £10 million, half of them scored more than 15 goals in a season. How many Arsenal strikers garnered a transfer fee over £10 million since 1999-2000? 27%.
It’s almost guaranteed that Vardy will command a transfer fee well in excess of £10m, so I scrambled for about 20 minutes trying to figure out how to use an online Bayes’ theorem calculator, and managed to adjust my probability to 60%.
How did I do this? Well, the base rate is 18%: the share of Arsenal’s first-year strikers who scored 15+ goals in their first season. Of that 18%, 75% cost more than £10 million in the transfer market. As for the strikers who fell short of 15 goals, only 11% earned a transfer fee in excess of £10 million (market efficiency for the win?). You just punch these numbers into one of these websites and…voilà! Adjusted odds!
Now, I may have done this completely incorrectly, so let me know. But for now Vardy looks to have reasonably good odds of hitting 15 goals.
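For anyone who wants to check the calculator’s output, here is a minimal Python sketch of Bayes’ theorem using the three numbers from this step. The function and variable names are illustrative, and it assumes the online calculator applies the standard two-hypothesis form of the theorem.

```python
def bayes_posterior(prior, likelihood_if_true, likelihood_if_false):
    """Return P(hypothesis | evidence) via Bayes' theorem for a yes/no hypothesis."""
    # Overall probability of seeing the evidence (here: a fee over £10 million).
    evidence = prior * likelihood_if_true + (1 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence


# Inputs as quoted in the post (all rounded, small-sample figures).
base_rate = 0.18             # P(15+ goals in debut season)
fee_given_scorer = 0.75      # P(fee over £10m | 15+ goals)
fee_given_non_scorer = 0.11  # P(fee over £10m | fewer than 15 goals)

adjusted = bayes_posterior(base_rate, fee_given_scorer, fee_given_non_scorer)
print(f"P(15+ goals | fee over £10m) = {adjusted:.1%}")
```

With these rounded inputs the result lands just under 60%, in line with the adjusted figure in this step.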
Step 3: Age
From here on in, I’m eyeballing the weights because I’m working with a pitiful sample size. Vardy’s 29 years old…the second oldest after Jermain Defoe on the list of 15+ goal scorers this season. Most strikers peak a little earlier, and for Arsenal striker signings this is pretty old, so I’m dropping the odds a little to 56%. Too much? Too little?
Step 4: Previous Team Quality
I happen to know from my many travels that teams are better off signing players from equally good or better teams. So normally, this would mean a buy like this is suspect. But a quick check at the excellent clubelo.com reveals Leicester City are just behind Arsenal in Elo rankings. So, these teams, maybe not that different right now, you know? So I’ll leave that 56% as is.
Step 5: Consistency in Goal Production
Keeping in mind that my overall aim here is to make a judgment in the absence of any advanced stats, or really any in-depth knowledge at all, I took a brief glance at Vardy’s wiki stats (yes, wiki!), which show that the dude has run very hot and cold every other year in the scoring department since the NPL Premier Division days in 2010. That’s kind of…concerning?
Now, you could make the argument that his fortunes depended on those of the team around him, I suppose, and so therefore a team with Arsenal’s consistency should provide more goal scoring opportunities. But this is a big question mark…whether it’s enough to bring down the probability further, I don’t know. But right now I would peg it at 53%, just to be a smart ass.
Now, here’s the thing: 53% is a terrible figure to end up with. That’s the equivalent of, “Maybe he’ll score 15 plus goals, maybe he won’t, time will tell!” But I think for most strikers on the market, a 50/50 chance that they’ll bag 15 plus goals in their first season is decent odds. If someone said you could make a coin flip and get a 15 plus goal scorer in the transfer market, you’d be silly not to go for it.
But this is a back-of-the-envelope calculation I made in less than an hour, literally with pen and paper and transfermarkt.co.uk, based on a pitifully small sample (though generally it seems that when Arsenal pay a lot for a striker, they get a good return). Moreover, I included zero ‘advanced stats’ here. This is probably more precise than me just speculating whether Vardy will be a “good fit tactically”, but not by much.
The point, however, isn’t to be accurate, but to hopefully demonstrate that even in the absence of data or even mathematical skill, you can still attempt to do something that goes beyond the tired ‘expert pundit bloviating/soulless big data analytics’ dichotomy we continue to apply to sports predictions—which, to be frank, is a boring and misleading pile of horse shit—and at least try to do something interesting.
In fact, one of the goals of this site over the next season is to showcase ways in which older folks like myself with only a nominal background in mathematics might improve their skill in this field, through self-education and experimentation.
If Front Office Report does reasonably well in the subscriptions department when it launches in earnest later this summer, I plan to spend a bit of time learning basic statistical math, and from there, who knows? Maybe some data science, too.
I want to track my progress, to show others that it can be done (perhaps). Maybe not with the goal of becoming a fully fledged professional, but at least to be a better communicator.