Homer Simpson put it best: “Pfffft. Facts are meaningless. You can use facts to prove anything that’s even remotely true. Facts schmacts.” (The Simpsons). With the proliferation of advanced analytics in professional sports punditry, this has never been more true.
Let’s be 100% transparent off the top: I am not an analytics expert. In fact, I am more on the abacus end of sophisticated math. With my bona fides on the table, let’s have a sabermetrics discussion. My friend Vaughan and I host a Winnipeg Jets podcast called “The Airport Lounge”. In between dated pop culture references and middling humour, we often try to blend analytics into our discussion. Why? We believe that the more varied the ways we can look at the game we love, and the team we love, the better. As novices, we see the incredible value of advanced analytics, but also some of the pitfalls. What stats like Expected Goals, Corsi, Fenwick, axDiff, and WAR can do is tell us what our eyes may or may not be seeing.
A Novice Look at Winnipeg Jets Advanced Analytics
Let us use Mark Scheifele as an example. I am on record lauding his play thus far this year. His tallies against the Blues and the Kings exhibited talent difficult to come by in the NHL. The defensive effort, while not always operating at peak efficiency, has been consistent. Some of my Scheifele grandstanding is based on last year’s performance, which, when used as a benchmark, compares him to his worst year as a Jet (as popular opinion would dictate). But what do those metrics say?
The table below shows 55’s advanced stats, courtesy of MoneyPuck:
Season | Corsi (CF%) | Fenwick (FF%) | Expected Goals (xG%) |
---|---|---|---|
A | 56% | 56% | 54% |
B | 54% | 55% | 48% |
Can we guess which of “A” or “B” represents his stats from this 2022-2023 season? You’ve seen this trick before... the answer is B. Statistically, Scheifele is performing worse this year than last.
So what is happening here? Our eyes are telling us one thing, but the statistics are saying something completely different.
Firstly, a common mistake in the use of analytics is placing too much confidence in small sample sizes. Sumner Health Centre states, via their Law of Small Numbers article:
"The law of small numbers is a statistical quirk that is vitally important in the understanding and interpretation of data. In brief, it points out that when a sample size is small, small random changes have a large apparent effect on the analysis of the data."
For instance, over a 2-game span, statistically, Logan Stanley was the ‘best’ player on the Winnipeg Jets. Against the Leafs and Blues, Stanley led the team in Expected Goals (as an average over the 2 games) at 5-on-5. He also played isolated minutes in a tertiary role. What then can we derive from this metric? Stanley had above-average outings in 2 games relative to his teammates. It doesn’t mean that Stanley has turned a corner and will rocket up the depth chart, or that he is the ‘best’ Winnipeg Jet. It is just too small a sample size.
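For the spreadsheet-inclined, here is a minimal Python sketch of the law of small numbers. It assumes a hypothetical player whose true share of on-ice chances is exactly 50%, with a made-up 10 chances per game. None of these numbers come from real Jets data, and the function names are mine, not from any stats site.

```python
import random
import statistics


def observed_share(games, rng, events_per_game=10, true_share=0.5):
    """Fraction of simulated on-ice chances that go the player's way.

    events_per_game and true_share are illustrative assumptions only.
    """
    total_events = games * events_per_game
    favourable = sum(1 for _ in range(total_events) if rng.random() < true_share)
    return favourable / total_events


def spread_of_observed_shares(games, trials=1000, seed=55):
    """Standard deviation of the observed share across many simulated stretches."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    shares = [observed_share(games, rng) for _ in range(trials)]
    return statistics.pstdev(shares)


# The same 50% player, measured over short and long stretches.
two_game_spread = spread_of_observed_shares(games=2)
full_season_spread = spread_of_observed_shares(games=82)

print(f"Spread over 2-game stretches:  {two_game_spread:.3f}")
print(f"Spread over 82-game stretches: {full_season_spread:.3f}")
```

The player never changes, yet the 2-game results swing several times more widely than the 82-game results. That is how a tertiary defenceman can look like the ‘best’ Jet for a weekend.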
The other underlying issue with a number of advanced stats is sampling. Statistics are only as good as the data they are based on. Expected goals are explained below, via the Seattle Kraken’s website:
"In the broadest sense, Expected Goals (xG) is a measure that considers a variety of factors and then mathematically assigns a value to each shot attempt that represents the probability of that shot becoming a goal."
If a player has an xGF% of more than 50%, that means they are helping create an offensive advantage for their team as measured by shot quality. Well, who is assigning these values? Sites like Evolving-Hockey.com, NaturalStatTrick.com, MoneyPuck.com, and HockeyViz.com all do a great job tracking xG, but you will notice that the numbers vary slightly from site to site. A metric like “High-Danger Chances” is a perfect example of the subjectivity of an advanced stat. At the end of the day, human biases and error factor into the criteria of what constitutes a high-danger chance, regardless of diligence.
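To make the arithmetic concrete, here is a rough Python sketch of how an xGF% could be computed once every shot attempt has been given a goal probability. The function name `xgf_percent` and all of the probabilities are invented for illustration; the two "models" are hypothetical and do not represent how any of the sites above actually grade shots.

```python
def xgf_percent(xg_for, xg_against):
    """Share of total expected goals that belongs to the player's team, as a percentage."""
    total_for = sum(xg_for)
    total_against = sum(xg_against)
    total = total_for + total_against
    if total == 0:
        raise ValueError("no shot attempts to evaluate")
    return 100 * total_for / total


# The same seven shot attempts, graded by two hypothetical models.
# Each number is that model's probability that the shot becomes a goal.
model_a_for = [0.10, 0.25, 0.05, 0.30]
model_a_against = [0.08, 0.12, 0.20]

model_b_for = [0.08, 0.20, 0.05, 0.27]
model_b_against = [0.10, 0.15, 0.25]

model_a_share = xgf_percent(model_a_for, model_a_against)
model_b_share = xgf_percent(model_b_for, model_b_against)

print(f"Model A xGF%: {model_a_share:.1f}")
print(f"Model B xGF%: {model_b_share:.1f}")
```

Both made-up models agree the player is above 50%, but they disagree on by how much, because each one values the same shots a little differently. That is the site-to-site variation in miniature.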
However, sometimes analytics confirm exactly what we are seeing on the ice. Nikolaj Ehlers is analytics catnip. His ability to drive play changes the flow of the game and the stats back that up. Connor-Scheifele-Ehlers (again, in limited minutes) have a 54% Corsi, but swap out Ehlers for Appleton and the Corsi drops to 44%.
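For anyone wondering what sits behind a Corsi number, here is a small sketch in Python. Corsi counts all shot attempts (on goal, missed, and blocked), Fenwick leaves out the blocked ones, and the percentage is simply the team's share of the total while a player or line is on the ice. The shot counts below are invented so the math is easy to follow; they are not the actual totals for any Jets line.

```python
from dataclasses import dataclass


@dataclass
class ShotAttempts:
    on_goal: int  # shots on goal, goals included
    missed: int   # attempts that missed the net
    blocked: int  # attempts blocked before reaching the net

    @property
    def corsi(self):
        """All shot attempts."""
        return self.on_goal + self.missed + self.blocked

    @property
    def fenwick(self):
        """Unblocked shot attempts only."""
        return self.on_goal + self.missed


def share_percent(count_for, count_against):
    """Team's share of the combined total, as a percentage."""
    total = count_for + count_against
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return 100 * count_for / total


# Hypothetical on-ice totals for one line (illustrative numbers only).
attempts_for = ShotAttempts(on_goal=30, missed=12, blocked=12)
attempts_against = ShotAttempts(on_goal=26, missed=10, blocked=10)

cf_percent = share_percent(attempts_for.corsi, attempts_against.corsi)
ff_percent = share_percent(attempts_for.fenwick, attempts_against.fenwick)

print(f"CF%: {cf_percent:.1f}")
print(f"FF%: {ff_percent:.1f}")
```

With these made-up totals the line takes 54 of the 100 shot attempts, so it lands at a 54% Corsi. Swap in a linemate who gives up more attempts than he generates and the share falls below 50% in a hurry.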
Another example: Kyle Connor is struggling. Do the stats back that up? Resoundingly, yes. Connor ranks dead last in Goals Scored Above Expected at -4. This will regress back to his mean, but he has hit a startling number of chest protectors this year. With Ehlers out and Connor struggling, it is actually a small miracle the Jets sit at 5-3-1.
One of the bigger issues for experts and novices alike is that advanced stats are simply misunderstood (present company included). I recently engaged in a Twitter discussion with a few fellow discerning Jets fans. It concerned the following graph, courtesy of Hockeyviz.com, and how to interpret it. If this graph looks like a Rorschach test to you, you are not alone. Let me be clear, I contributed nothing of value to the conversation.
Seemingly, the graph is saying that Connor and Scheifele have performed well defensively without Ehlers. It wasn’t until Micah Blake McCurdy himself joined the thread that I understood what was going on here. He explained, via Twitter:
In this humble writer’s opinion, analytics are a great tool for analyzing the game (even as I learn more). Like any tool, they have a specific use and are not tantamount to an Allen key at IKEA (one size fits all). Mark Scheifele is very good, Kyle Connor is slumping, and Logan Stanley is tall. If you do the math, it makes sense.