We’ve all been there.
You’re sitting and watching a game and a particularly egregious officiating call is made (probably against the team you’re rooting for). And not for the first time you wonder — loudly — why we don’t replace these meat puppets with computers.
“We have GPS chips everywhere now! Why are we relying on old dudes[1] to spot the ball?”
or
“My car can parallel park itself for fuck’s sake. There has to be a better way than watching these clowns putter around and pretend like they saw where a guy’s knee went down!”
I can sympathize. And you might expect a data guy like me to be the first person to advocate for a more objective way of officiating America’s favorite sport. But, nah. Why? I’ve worked with the data.
Consider the simplest task we could give these chips: Telling us where on the field a stationary ball is. Now imagine the situation during a game where it would be almost impossible to fuck things up. Say, right after a touchback. We know for certain where the ball should be on the first play after one of those: the 25. How well do the chips do then?
Oof.
Over the 8,842 of these plays for which I have data[2] from 2017 to 2022, the mean absolute error is right around two feet. But that’s just the average. The maximum recorded error is just over two yards.
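For anyone who wants to follow the arithmetic, here is a minimal Python sketch of that error calculation. It is not the code behind the numbers above. The function names, the yards-from-own-goal-line convention, and the four sample readings are made up for illustration.

```python
from statistics import mean

# Assumption: readings are in yards from the offense's own goal line,
# so a ball spotted after a touchback should read exactly 25.0.
EXPECTED_YARDLINE = 25.0
INCHES_PER_YARD = 36.0


def spot_errors_inches(ball_yardlines, expected=EXPECTED_YARDLINE):
    """Signed error in inches per reading; negative means behind the expected spot."""
    return [(y - expected) * INCHES_PER_YARD for y in ball_yardlines]


def summarize(errors_inches):
    """Mean and maximum absolute error, both in inches."""
    abs_errors = [abs(e) for e in errors_inches]
    return {
        "mean_abs_error_in": mean(abs_errors),
        "max_abs_error_in": max(abs_errors),
    }


# Hypothetical readings, NOT real tracking data.
readings = [24.5, 25.0, 25.5, 23.0]
summary = summarize(spot_errors_inches(readings))
print(summary)
```

Keeping the signed errors around before taking absolute values is deliberate: the sign shows whether the misses pile up in front of the expected spot or behind it.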
When I started working with tracking data back in 2019, it was understood that there were problems with the chips when the ball was thrown and spinning. Tracking locations in those conditions gave the sensors a rough time, it was a known issue, and work was being done to fix things. I naively assumed that a stationary location reading was probably pretty good. Clearly it’s not — but perhaps ball location data has improved over time?
Things have improved, somewhat. But an average error of 22 inches is still wildly inadequate. However bad things are with human refs, 22 inches is probably an order of magnitude worse than what we’ve currently got.
We need to keep the hoomans.
Caveats and Discussion
There are a couple of potential problems with this analysis that could explain some of the error in the data.
First of all, I used ball location just before the snap. That means that the center has picked up the ball and put it on the ground, nose first. That displacement introduces some error.
It’s also possible that the timing is off. In other words, the ball could already be snapped and on its way back to the QB, not stationary, for some of these observations. The fact that most of the error is behind the 25-yard line lends some credence to this concern.
I readily concede that it would be better to use the ball location after the ball has been spotted but before it’s touched — like when the offense is in the huddle. At that moment there is no chance of these types of errors occurring.
Unfortunately, I do not have that data; I would have loved to use it. However, I can report that analysts who do have that data ran exactly that check, and their findings are close to mine.
As another check, we can look at the lateral error on these measurements, sideline-to-sideline. While we can’t know for certain where the ball was placed between the hashes[3], we do know where it can’t be located: outside the hashes.[4] And lateral ball location should not be affected by a timing mismatch unless the timings are wildly off[5].
Yet the picture here is somehow worse. Since 2017, 21.5 percent of all observations are impossible. And the share of impossible locations has only grown.
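The lateral check is simple enough to sketch in Python as well. Again, this is an illustration under stated assumptions, not the code that produced the 21.5 percent figure. NFL hash marks sit 70 feet 9 inches in from each sideline on a 160-foot-wide field. I am assuming the lateral coordinate is measured in yards from one sideline, and the sample values are invented.

```python
FEET_PER_YARD = 3.0
FIELD_WIDTH_FT = 160.0
HASH_FROM_SIDELINE_FT = 70.75  # 70 ft 9 in

# Assumption: the lateral coordinate is in yards from one sideline.
NEAR_HASH_YD = HASH_FROM_SIDELINE_FT / FEET_PER_YARD
FAR_HASH_YD = (FIELD_WIDTH_FT - HASH_FROM_SIDELINE_FT) / FEET_PER_YARD


def is_impossible(lateral_yards):
    """True when a reading is on or outside the hashes.

    The inequality is strict because the ball must sit between,
    not on, the hash marks.
    """
    return not (NEAR_HASH_YD < lateral_yards < FAR_HASH_YD)


def impossible_share(lateral_readings):
    """Fraction of readings that fall on or outside the hashes."""
    flagged = sum(is_impossible(y) for y in lateral_readings)
    return flagged / len(lateral_readings)


# Hypothetical readings, NOT real tracking data.
sample = [26.7, 23.0, 30.5, 25.0]
share = impossible_share(sample)
print(f"{share:.1%} of sample readings are outside the hashes")
```

This treats the ball as a single point. A stricter version could pad the bounds by half the ball's width, which would only flag more readings.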
[1] Mostly: there are female refs in the NFL now too.
[2] There is some missingness in the data, but for touchbacks I can’t tell myself a convincing story about why it might matter.
[3] After touchbacks the receiving team gets to choose where between the hashes to place the ball.
[4] The ball has to be placed between, not on, the hashes.
[5] This is unlikely. Resolution is 1/10th of a second.