HOCK.ly - Future of Hockey Content 2013-2014 Season Preview | Page 60

Corsi

Next we'll look at one of the most important new statistics, Corsi. Don't let the name fool you: it's actually just a team's attempted shot differential. Named after Buffalo's goalie coach Jim Corsi, this statistic is the difference between all the shots attempted by the two teams, including those that hit the post, missed the net or were blocked. In the case of teams it is often presented as a percentage, since outshooting a team 30 to 20 (60%) is a lot different than outshooting them 50 to 40 (55.6%), even though the differential is +10 in both cases.
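To make the arithmetic concrete, here is a minimal Python sketch of the percentage calculation. The function name corsi_percentage is just an illustrative choice, not something taken from any of the stats websites.

```python
def corsi_percentage(attempts_for: int, attempts_against: int) -> float:
    """Share of all shot attempts (on goal, missed, off the post, blocked)
    that belonged to one team, expressed as a percentage."""
    total = attempts_for + attempts_against
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return 100.0 * attempts_for / total


# Both games have a differential of +10, but very different percentages.
print(round(corsi_percentage(30, 20), 1))  # 60.0
print(round(corsi_percentage(50, 40), 1))  # 55.6
```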

Why is Corsi so important? First of all, a team that is taking 60% of all attempted shots has been shown to be in possession of the puck 60% of the time, too. Secondly, while things like shooting and save percentage can bounce up and down from one game, week, month or even season to the next, a team's rate of puck possession tends to be predictable and persistent. That's why a team's Corsi percentage actually does a better job at predicting a team's record for the following season than the actual standings themselves.

Other statistics can be used as proxies for puck possession, of course, including regular shots, scoring chances, a stopwatch recording of time in zone, or even the goals themselves. Corsi is preferred because of the availability of the data, the lack of any human opinion in its measurement, and because the higher volume of events will lead to an accurate picture of puck possession faster than the alternatives.

Corsi led to a statistic called Fenwick, which is exactly the same as Corsi but without counting blocked shots. Fenwick percentage has been shown to be even more persistent, and even more predictive, especially among teams that have a demonstrated and committed ability to block shots, like the New York Rangers.
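The difference between the two is easiest to see side by side. The sketch below assumes shot attempts have already been tallied into three buckets; the Attempts class, the share helper and all of the counts are hypothetical, chosen only to show how a shot-blocking team can look better by Fenwick than by Corsi.

```python
from dataclasses import dataclass


@dataclass
class Attempts:
    on_goal: int  # goals plus saved shots
    missed: int   # wide of the net or off the post
    blocked: int  # this team's attempts that the opponent blocked

    def corsi(self) -> int:
        # Corsi counts every attempt, blocked or not.
        return self.on_goal + self.missed + self.blocked

    def fenwick(self) -> int:
        # Fenwick is Corsi without the blocked attempts.
        return self.on_goal + self.missed


def share(count_for: int, count_against: int) -> float:
    """One team's percentage share of the combined count."""
    return 100.0 * count_for / (count_for + count_against)


# Hypothetical game: the home team blocks 22 of the visitors' attempts.
home = Attempts(on_goal=30, missed=12, blocked=8)
away = Attempts(on_goal=28, missed=10, blocked=22)

print(round(share(home.corsi(), away.corsi()), 1))      # 45.5
print(round(share(home.fenwick(), away.fenwick()), 1))  # 52.5
```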

Here is the close game Fenwick percentage (CGF%) leader board for the 2013 NHL season, which is available at websites like Behind the Net and Hockey Analysis. To remove the skewing effect special teams can have on attempted shots, it includes five-on-five manpower situations only. Furthermore, it includes only situations where the game was close, in order to remove the skewing effect caused by the change in playing style when teams are chasing or sitting on leads.
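For the curious, the two filters described above can be sketched in a few lines of Python. Note that "close" is not defined here, so the sketch assumes one commonly cited rule (within one goal in the first two periods, or tied from the third period on); the Attempt record, its field names and the sample events are all hypothetical.

```python
from typing import List, NamedTuple


class Attempt(NamedTuple):
    team: str             # team taking the shot attempt
    period: int
    goal_diff: int        # shooting team's goals minus the opponent's
    skaters_for: int
    skaters_against: int
    blocked: bool


def is_close(period: int, goal_diff: int) -> bool:
    # Assumed rule: within one goal in periods 1-2, tied afterwards.
    if period <= 2:
        return abs(goal_diff) <= 1
    return goal_diff == 0


def close_game_fenwick_pct(attempts: List[Attempt], team: str) -> float:
    kept = [
        a for a in attempts
        if not a.blocked                                   # Fenwick ignores blocks
        and a.skaters_for == 5 and a.skaters_against == 5  # five-on-five only
        and is_close(a.period, a.goal_diff)                # close situations only
    ]
    if not kept:
        raise ValueError("no qualifying shot attempts")
    return 100.0 * sum(1 for a in kept if a.team == team) / len(kept)


events = [
    Attempt("LAK", 1, 0, 5, 5, False),   # kept
    Attempt("TOR", 1, 0, 5, 5, False),   # kept
    Attempt("LAK", 2, 1, 5, 5, True),    # dropped: blocked
    Attempt("LAK", 2, 1, 5, 4, False),   # dropped: power play
    Attempt("TOR", 3, -2, 5, 5, False),  # dropped: not close
    Attempt("LAK", 3, 0, 5, 5, False),   # kept
]
print(round(close_game_fenwick_pct(events, "LAK"), 1))  # 66.7
```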

This leader board isn't meant to represent that one team is better than another overall, merely that one team had the puck more than another. For example, the Toronto Maple Leafs found a way to be successful last year without having the puck very often at all, while

It was only a matter of time before a statistic like Corsi was used for individual players. In this context it is usually represented as a differential instead of a percentage, and expressed as a rate per 60 minutes of even-strength play. For example, Justin Williams led the NHL last season with a Corsi of +29.8. That means that Los Angeles enjoyed 29.8 more attempted shots than their opponents for every 60 minutes Williams was on the ice. In contrast, Toronto had 31.7 fewer attempted shots than their opponents for every 60 minutes Jay McClement was on the ice.
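As a rough sketch, the per-60 rate is just the on-ice differential scaled by ice time. The function name and the sample totals below are made up for illustration; they are not Williams' or McClement's actual numbers.

```python
def corsi_per_60(attempts_for: int, attempts_against: int, minutes: float) -> float:
    """On-ice shot attempt differential per 60 minutes of even-strength play."""
    if minutes <= 0:
        raise ValueError("ice time must be positive")
    return 60.0 * (attempts_for - attempts_against) / minutes


# Hypothetical player: 650 attempts for and 500 against in 600 minutes.
print(corsi_per_60(650, 500, 600.0))  # 15.0
```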

There are several complications when using Corsi for individuals instead of teams. The first one is obvious, since we know that Los Angeles had the puck 57.4% of the time and Toronto just 44.0% of the time. Clearly that partly explains why Williams is first and McClement is last. That's why a player's Corsi is usually expressed relative to how the team did without that player on the ice. For example, Williams has a Relative Corsi of +24.0, since the Kings had a Corsi of +29.8 when he was on the ice, and +5.8 when he wasn't. Similarly, McClement's Relative Corsi is -23.7, since the Leafs had a Corsi of -31.7 when he was on the ice, and -8.0 when he wasn't.
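The Relative Corsi subtraction can be written out as a tiny Python sketch, using the on-ice and off-ice figures quoted above. The function name is an illustrative choice.

```python
def relative_corsi(on_ice: float, off_ice: float) -> float:
    """Team's Corsi rate with the player on the ice minus the rate without him."""
    return round(on_ice - off_ice, 1)


print(relative_corsi(29.8, 5.8))    # Williams: 24.0
print(relative_corsi(-31.7, -8.0))  # McClement: -23.7
```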

There's still another major difference between Williams and McClement, of course. When using Corsi (or any statistic at all) on an individual level, a player's usage is critical. A player's line mates, his usual opponents, and even the zone in which he's asked to play can affect all of his statistics, both traditional and non-traditional.

Williams, for example, played with the amazing Anze Kopitar and Dustin Brown on L.A.'s top line, and started 57.5% of his (non-neutral) shifts in the offensive zone. His most frequent opponents included mostly top-line forwards like Corey Perry and Zach Parise, but also checking specialists like Eric Nystrom. On the other hand, McClement started only 27.9% of his shifts in the offensive zone, played with Mikhail Grabovski and Nikolai Kulemin, and his most frequent opponents included almost exclusively players like Alexander Ovechkin, Daniel Alfredsson and Rick Nash.

Three statistics are used to measure a player's usage. The offensive boost a player receives is measured by his offensive zone start percentage, which is the percentage of non-neutral shifts that a player began in the offensive zone. The quality of one's competition and one's team mates is most commonly measured by the average Corsi (or Relative Corsi) of those players, and more recently by the average ice-time of one's opponents. This information is available in the NHL's game files, but helpfully parsed and presented by the same two websites mentioned above.
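Of the three, the zone start percentage is the simplest to compute by hand. Here is a minimal sketch; the function name is illustrative and the shift counts are invented, picked only so that the result lands on a 57.5% figure like the one quoted for Williams.

```python
def offensive_zone_start_pct(offensive_starts: int, defensive_starts: int) -> float:
    """Percentage of non-neutral shifts begun in the offensive zone.

    Neutral-zone starts are deliberately left out of the calculation.
    """
    non_neutral = offensive_starts + defensive_starts
    if non_neutral == 0:
        raise ValueError("no offensive or defensive zone starts recorded")
    return 100.0 * offensive_starts / non_neutral


# Hypothetical counts: 230 offensive-zone and 170 defensive-zone starts.
print(offensive_zone_start_pct(230, 170))  # 57.5
```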

All of a player's usage information, and their Relative Corsi, can be presented on a Player Usage Chart, like the following example of the Toronto Maple Leafs, once again from Hockey Prospectus 2013-14. Throughout the season everyone can make their own Player Usage Charts using a tool at Hockey Abstract, at the risk of quickly losing all track of time.

(INSERT Toronto Maple Leafs Player Usage Chart HERE)

Mikhail Grabovski's mysterious drop from 0.70 points per game throughout 2010-11 and 2011-12 to just 0.33 last season is easy to understand with just the quickest of glances at Toronto's Player Usage Chart. He was used primarily in the defensive zone (horizontal axis), against top opponents (vertical axis), and alongside McClement and Kulemin. That would take the wind out of anyone's sails. Amazingly his Relative Corsi, which is represented by the sized and shaded bubbles, was actually still positive (shaded) despite his difficult assignment.

What can we learn from this? If Grabovski is used more like Phil Kessel and James van Riemsdyk in Washington, his scoring totals will return to normal. Likewise, if newcomer David Bolland assumes his old assignment in Toronto, his will tumble.

Shooting and Save Percentages

Everything we've covered so far is shot-based, which is largely the direction hockey analytics has been going lately. Of course, not all shots go in at the same rate, which brings us to PDO.

PDO doesn't actually stand for anything; it's the internet handle of the blog commenter who first suggested adding together a team's shooting and save percentages. Since every shot on goal ends up as either a goal or a save, the league-wide shooting and save percentages sum to exactly 100%, so a team's PDO should usually be pretty close to 1000. The Los Angeles Kings, for example, had a shooting percentage of 9.2% and a save percentage of .907, which totals 999 when added together. More recently a variation called Shooting Percentage Differential has been used, which is the same idea, but uses the shooting percentages of both teams, with one being subtracted from the other.
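Here is a short sketch of both calculations using the Kings' numbers from above. The function names are illustrative, and the differential assumes the opponents' shooting percentage is simply one minus the team's save percentage, which is one reading of the description rather than an official formula.

```python
def pdo(shooting_pct: float, save_pct: float) -> int:
    """Shooting percentage plus save percentage (both as fractions),
    scaled so that league average sits at 1000."""
    return round(1000 * (shooting_pct + save_pct))


def shooting_pct_differential(shooting_pct: float, save_pct: float) -> float:
    """Team shooting percentage minus opponents' shooting percentage,
    in percentage points (assumes opponents shoot at 1 - save_pct)."""
    opponent_shooting_pct = 1.0 - save_pct
    return round(100 * (shooting_pct - opponent_shooting_pct), 1)


# Los Angeles: 9.2% shooting, .907 save percentage.
print(pdo(0.092, 0.907))                        # 999
print(shooting_pct_differential(0.092, 0.907))  # -0.1
```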

These statistics are helpful because teams with PDOs that stray particularly far from 1000 tend to correct themselves in relatively short order. The Maple Leafs, for example, had a team PDO of 1036 last year, the highest in the league, while Florida was dead last with 969. This indicates the potential change in fortunes of these two teams in 2013-14.

PDO has also been used for individual players. It is measured by examining the team's PDO only when that player is on the ice, and generally only in five-on-five manpower situations. A player with an unusually high PDO, like Chris Kunitz (1074) or Nazem Kadri (1063), will generally post exceptional scoring and plus/minus statistics, while those whose PDO is unusually low, like Drew Shore (938), Olli Jokinen (939) and Jeff Skinner (940), will likely be seen as disappointments. However, PDO still has somewhat of a corrective factor, even for players.

The unpredictability of shooting and save percentages is another reason why statistical hockey analysts lean towards shot-based data. Even in the case of a player's offensive production, there's a clear preference for looking at the number of shots and passes a player attempts, rather than goals and assists. Since passes (that result in shots) aren't officially recorded by the NHL, they're usually just estimates based on a player's primary assists and team on-ice shooting percentage.

Other Statistics

Just to touch briefly on goal-based data, ESP/60 is one of the more popular such statistics. It is a player's even-strength scoring rate over 60 minutes. For top-line forwards, for instance, it should be at least 1.7.
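The rate itself is a one-line calculation. In this sketch the function name, the constant name and the sample player's totals are all hypothetical; only the 1.7 benchmark comes from the text above.

```python
TOP_LINE_BENCHMARK = 1.7  # minimum ESP/60 expected of a top-line forward


def esp_per_60(even_strength_points: int, even_strength_minutes: float) -> float:
    """Even-strength points scored per 60 minutes of even-strength ice time."""
    if even_strength_minutes <= 0:
        raise ValueError("ice time must be positive")
    return 60.0 * even_strength_points / even_strength_minutes


# Hypothetical forward: 40 even-strength points in 1,200 minutes.
rate = esp_per_60(40, 1200.0)
print(rate)                        # 2.0
print(rate >= TOP_LINE_BENCHMARK)  # True
```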

Some analysts also like to use all-in-one, catch-all statistics to get a preliminary overview of a team or a player before diving in for a deeper analysis. Other analysts, on the other hand, get almost blind with rage when estimates of any kind are employed.

Of those that make use of such metrics, including ESPN and teams like the Edmonton Oilers, Goals Versus Threshold (GVT) is the most popular. It measures all of a player's contributions, whether they're offensive, defensive, or in the shoot-out, relative to what you'd expect from an AHL-level call-up. The very similar Goals Versus Salary (GVS) is essentially the same thing, but measured relative to a player's cap hit instead.

One additional goal-based estimate of note is NHLe, which is the NHL equivalent of scoring data from another league, like in Europe, U.S. College or the Canadian Major Juniors. It is calculated by multiplying a player's scoring totals from another league by a translation factor that's based on those who previously moved from that league to the NHL. Various factors, including age for example, can also be taken into account. It's a nifty way to get a good idea of what to expect from a player coming to the NHL.
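A bare-bones version of the translation might look like the sketch below. The 0.30 translation factor and the player's totals are invented for illustration (real factors differ by league), and prorating to an 82-game NHL schedule is an assumption added here so that leagues with different season lengths can be compared; age and other adjustments are left out.

```python
NHL_GAMES = 82  # length of a full NHL regular season


def nhle(points: int, games_played: int, translation_factor: float) -> float:
    """Estimate NHL-equivalent points over a full NHL season.

    Scales the player's points per game in the other league by the
    league's translation factor, then prorates to 82 games.
    """
    if games_played <= 0:
        raise ValueError("games played must be positive")
    points_per_game = points / games_played
    return round(points_per_game * translation_factor * NHL_GAMES, 1)


# Hypothetical prospect: 60 points in 60 games, made-up factor of 0.30.
print(nhle(60, 60, 0.30))  # 24.6
```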

Goaltending

What about goalies? We'll be covering goaltenders in more detail in an upcoming issue, but briefly they're typically judged exclusively by their even-strength save percentage. It's not entirely helpful to include special teams play, since how many penalties a team takes and how effectively they kill penalties is largely outside a goaltender's control. These are also awfully small sample sizes, relative to five-on-five situations.

The only other popular goalie statistic is Quality Starts. Borrowed from baseball, it's meant to replace wins as an estimate of when a goalie plays well enough for his team to win, independent of their offense, or how many shots they allow. More on that next time.

Closing Thoughts

Statistical analysis has made its way into everything from sports to business and politics. The expanded ability to record, manipulate and access data using modern technology has radically improved the reliability and availability of this type of information. The key to taking advantage of these great developments in hockey analytics is to get a firm grasp of the key concepts, and the language used to express them.

For more information, pick up a copy of my latest book, Rob Vollman's Hockey Abstract, which explores the mainstream applications and limitations of hockey analytics in more detail.

Team            CGF%
Los Angeles     57.4%
Chicago         55.8%
New Jersey      55.0%
Boston          54.4%
Detroit         53.9%
St. Louis       53.9%
NY Rangers      53.9%
Montreal        53.6%
San Jose        52.4%
Ottawa          52.1%
NY Islanders    52.0%
Vancouver       51.7%
Carolina        51.1%
Phoenix         50.2%
Pittsburgh      49.9%
Winnipeg        49.7%
Florida         49.0%
Minnesota       48.7%
Philadelphia    48.5%
Calgary         48.2%
Anaheim         48.2%
Washington      47.7%
Dallas          47.3%
Colorado        46.7%
Nashville       45.9%
Columbus        45.4%
Tampa Bay       45.0%
Edmonton        44.5%
Toronto         44.0%
Buffalo         43.7%