I am convinced that once tracking data becomes (more) available in football, we’ll be in for 2-5 years of absolute chaos before we understand how to use it. There will be wailing and gnashing of teeth and absolute dogshit visualisations everywhere.
Marek Kwiatkowski, New Kind of Analytics Inc.
Tracking data won’t affect football decision making for two decades, if ever.
Paul Riley, Brand Excel
Now you have two choices. One, ignore the mismatch, don’t tell anyone and hope it’s never mentioned again. Two, investigate further.
Visually, no specific or significant directional patterns can be identified. The error is scary. In an xG model, two shots 100cm apart could end up with very different xG values.
Let’s see if there are any patterns in the direction of error in the subset of shots.
Once again, no patterns in the direction of error can be identified visually. The error appears to be generalised. My gut reaction is that this could have a large impact on Expected Goals (xG) values.
Impacts on xG
Let’s calculate xG values for all 3,751 shots based on their event data and tracking data x,y positions. Ben Torvaney’s xG model is used (there are more accurate models but this is the one I have access to). Comparing the difference in xG between the two locations will give us an initial glimpse at the size of the problem.
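For anyone who wants to try the same comparison, here is a minimal sketch. It does not use Ben Torvaney’s model: `toy_xg` is a placeholder with invented coefficients, the 105 x 68 metre pitch and the field names (`event_x`, `tracking_x` and so on) are my assumptions, and the difference is taken as tracking minus event purely by convention.

```python
import math
from statistics import mean, median

# Assumed pitch: 105 x 68 metres, attacking goal centred at (105, 34).
GOAL_X, GOAL_Y, GOAL_WIDTH = 105.0, 34.0, 7.32

def toy_xg(x, y):
    """Placeholder xG model: logistic in distance and goal-mouth angle.
    The coefficients are invented for illustration and are NOT fitted."""
    dx = GOAL_X - x
    dy = y - GOAL_Y
    distance = math.hypot(dx, dy)
    # Angle subtended by the two posts as seen from the shot location.
    angle = math.atan2(GOAL_WIDTH * dx, dx * dx + dy * dy - (GOAL_WIDTH / 2) ** 2)
    logit = -1.0 - 0.1 * distance + 1.5 * angle
    return 1.0 / (1.0 + math.exp(-logit))

def xg_differences(shots):
    """xG at the tracking location minus xG at the event location, per shot."""
    return [
        toy_xg(s["tracking_x"], s["tracking_y"]) - toy_xg(s["event_x"], s["event_y"])
        for s in shots
    ]

# Made-up shots, in metres, only to show the shape of the input.
shots = [
    {"event_x": 94.0, "event_y": 34.0, "tracking_x": 93.0, "tracking_y": 35.0},
    {"event_x": 88.0, "event_y": 40.0, "tracking_x": 90.0, "tracking_y": 39.0},
    {"event_x": 99.0, "event_y": 30.0, "tracking_x": 97.5, "tracking_y": 28.0},
]

diffs = xg_differences(shots)
largest = max(diffs, key=abs)  # the shift with the biggest magnitude, sign kept
print(f"mean={mean(diffs):.4f} median={median(diffs):.4f} largest={largest:.4f}")
```

Swapping `toy_xg` for a real model is the only change needed to get numbers that mean anything.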
The maximum shift in xG values was -0.3770, which is a significant change in value and a real issue as a single figure….
A quick peek at the distribution…
Look at that green middle-finger to my hypothesis!
So the easy-defence team wins? … “yeah but with enough data the errors will drain away via insignificant runoff!” … let’s see…
The mean difference in xG was -0.0081, whilst the median was 0.0004. So it’s true: when aggregating over a season, the potential error in xG values becomes insignificant.
However… what about at a player level?
Let’s get rid of all players that haven’t taken 10 shots or more and then rank the players by biggest mean xG difference. Here’s the top 10.
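A rough sketch of that filter-and-rank step, assuming each shot is a `(player, xg_difference)` pair. The function name `rank_players` is hypothetical, and ranking by the absolute size of the mean difference is my assumption about what “biggest” means here.

```python
from collections import defaultdict
from statistics import mean

def rank_players(shot_rows, min_shots=10, top_n=10):
    """Return (player, mean_xg_diff, shot_count) for players with at least
    `min_shots` shots, ordered by the magnitude of their mean xG difference."""
    per_player = defaultdict(list)
    for player, xg_diff in shot_rows:
        per_player[player].append(xg_diff)

    ranked = [
        (player, mean(diffs), len(diffs))
        for player, diffs in per_player.items()
        if len(diffs) >= min_shots  # drop players with fewer than min_shots shots
    ]
    ranked.sort(key=lambda row: abs(row[1]), reverse=True)
    return ranked[:top_n]

# Made-up data: Player B has a big mean difference but too few shots to count.
shot_rows = (
    [("Player A", 0.01)] * 12
    + [("Player B", 0.20)] * 3
    + [("Player C", -0.02)] * 10
)
ranked = rank_players(shot_rows)
for player, mean_diff, n_shots in ranked:
    print(f"{player}: mean xG diff {mean_diff:+.4f} over {n_shots} shots")
```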
Once again the xG difference is almost insignificant and would not significantly impact the use of season long xG values when filtering and comparing players to recruit.
So… what about on a per-match basis?
We could use the mean distance difference for all shots of 216cm and do some simulations. A more representative methodology would be to probabilistically generate distance differences based on their frequencies within the dataset.
The probability distribution shows clear patterns:
So… Let’s simulate each game 1,880 times using the following steps for each shot within that game:
- Generate a ‘distance error’ based on the above probability distribution.
- Choose a random spot that is the same distance away from the original event as the ‘distance error’, yet still on the pitch!
- Calculate the new xG value for the simulated x,y position.
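The three steps above can be sketched roughly as follows. This is an illustration under assumptions, not the exact code behind the article: the pitch is taken as 105 x 68 metres, distances are in metres rather than centimetres, the empirical distribution is approximated by drawing from a list of observed errors, the random spot is found by trying random directions until one lands on the pitch, and `toy_xg` is an invented stand-in for a real xG model.

```python
import math
import random

PITCH_LENGTH, PITCH_WIDTH = 105.0, 68.0  # metres (assumed dimensions)

def toy_xg(x, y):
    """Invented stand-in for an xG model: decays with distance to goal."""
    return math.exp(-math.hypot(PITCH_LENGTH - x, PITCH_WIDTH / 2 - y) / 10.0)

def sample_distance_error(observed_errors_m, rng):
    # Picking uniformly from the observed errors reproduces their frequencies.
    return rng.choice(observed_errors_m)

def displaced_position(x, y, distance, rng, max_tries=1000):
    """Random point exactly `distance` away from (x, y) that is on the pitch."""
    for _ in range(max_tries):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        new_x = x + distance * math.cos(theta)
        new_y = y + distance * math.sin(theta)
        if 0.0 <= new_x <= PITCH_LENGTH and 0.0 <= new_y <= PITCH_WIDTH:
            return new_x, new_y
    return x, y  # fallback: no valid direction found, keep the original spot

def simulate_match(shots, observed_errors_m, xg_model, rng):
    """Sum of xG over one team's shots after displacing every shot once."""
    total = 0.0
    for x, y in shots:
        distance = sample_distance_error(observed_errors_m, rng)
        new_x, new_y = displaced_position(x, y, distance, rng)
        total += xg_model(new_x, new_y)
    return total

rng = random.Random(42)                    # seeded so runs are repeatable
shots = [(94.0, 34.0), (85.0, 40.0), (99.0, 30.0)]  # made-up shot locations
observed_errors_m = [0.5, 1.0, 2.16, 3.0, 5.0]      # made-up error sample
original_total = sum(toy_xg(x, y) for x, y in shots)
sims = [simulate_match(shots, observed_errors_m, toy_xg, rng) for _ in range(1880)]
print(f"original xG {original_total:.3f}, simulated range "
      f"{min(sims):.3f} to {max(sims):.3f}")
```

Rejection sampling on the direction keeps the displacement distance exact, which is the point of the second step; clipping the new position to the touchline would be simpler but would shorten some of the errors.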
For each match simulation the original and simulated xG values can be summed and compared to better understand the impact on single match xG analysis.
Let’s study the probability of various swings in xG per team. We are more interested in the extent of the swing than its direction, so all negative values have been converted to their additive inverse (in other words, we take the absolute value).
It’s positive to see that there is a 19.8% chance that there will be a swing of less than 0.1 xG per team per match. However, it’s concerning to see that there is a 5% chance that there will be a swing of more than 1 xG per team per match!
The potential of large swings makes you think that many single match xG results are wrong! What % of simulated games showed a reversal of xG match result? 6.7%!
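Here is one way the swing and reversal figures could be tallied from the simulation output. The input layout and the function name `swing_summary` are hypothetical, and I am assuming a “reversal” means the team with the higher xG total changes between the original and the simulated match.

```python
def swing_summary(results):
    """results: list of (orig_home, orig_away, sim_home, sim_away) xG totals,
    one tuple per simulated match."""
    swings = []
    reversals = 0
    for orig_home, orig_away, sim_home, sim_away in results:
        # Extent of the swing per team, ignoring direction.
        swings.append(abs(sim_home - orig_home))
        swings.append(abs(sim_away - orig_away))
        # The xG "winner" flips when the two margins have opposite signs.
        if (orig_home - orig_away) * (sim_home - sim_away) < 0:
            reversals += 1
    return {
        "p_swing_below_0.1": sum(s < 0.1 for s in swings) / len(swings),
        "p_swing_above_1.0": sum(s > 1.0 for s in swings) / len(swings),
        "p_reversal": reversals / len(results),
    }

# Made-up simulation results, only to show the calculation.
results = [
    (1.5, 1.0, 1.45, 1.05),
    (1.2, 1.1, 1.0, 1.3),
    (0.8, 2.0, 2.0, 1.7),
    (1.0, 0.5, 1.3, 0.2),
]
summary = swing_summary(results)
print(summary)
```

Matches where either margin is exactly zero are not counted as reversals in this sketch; whether a draw turning into a win should count is a judgement call.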
Although coaches utilise long-term trends, they have a large thirst for the data of single matches. Therefore we have to ask ourselves: are we comfortable with the dataset’s inherent error when presenting results at this level of granularity? If we are not, how do we present this margin of error/uncertainty to coaches?
Our audience may be skeptical and probabilities can be intuitively misread, so it would be easy to hide our error/uncertainty in a cupboard. However, I believe this only provides skeptics with ammunition to shoot us in the head if that error/uncertainty ever trips us up. Others will strongly disagree. There are good examples emerging within the media.
I would love to see more data visualisations within our small community that develop our competencies to discuss error and uncertainty with our audiences. Showing some weakness just might strengthen our relationships and influence with our audience.
** Disclaimer – TRACAB and Opta data are not used in this article **