
Era Adjustments (Part 3)

All data via nhl.com

In this update, I have added the first 50 NHL seasons to my data and switched the source to the newest NHL.com stats, which are described as more accurate. In summary, I now have adjusted points values for every NHL season, using the most accurate point totals available. I also created some new visuals and a hall of fame score, which are explained below.

As a note, seasonal and career stats only include seasons in which the given player appeared in at least half of the possible games: GP >= max possible GP / 2. This was done to ensure the distribution represents NHL regulars. As a result, you may notice that some seasons are missing or that point totals look off (ex. Marcel Dionne's '88-89 was omitted from the analysis and from his totals because he only played in 37 games, and 37 < 80/2).
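The cutoff above can be sketched in a few lines of Python. This is only an illustration and not the code used for this post: the function names `is_qualifying_season` and `career_points` and the three example seasons are hypothetical, made up to show how a short season drops out of a career total. Only the Dionne check (37 GP in an 80-game schedule) comes from the text.

```python
def is_qualifying_season(gp, max_gp):
    """Return True when the player appeared in at least half of the possible games."""
    return gp >= max_gp / 2


def career_points(seasons):
    """Sum points over qualifying seasons only (hypothetical helper)."""
    return sum(
        s["points"]
        for s in seasons
        if is_qualifying_season(s["gp"], s["max_gp"])
    )


# Dionne's '88-89 from the text: 37 GP in an 80-game schedule, so 37 < 40.
print(is_qualifying_season(37, 80))

# Made-up seasons for illustration only; these are not real stat lines.
example_seasons = [
    {"gp": 80, "max_gp": 80, "points": 90},  # kept
    {"gp": 37, "max_gp": 80, "points": 30},  # dropped: below half of the schedule
    {"gp": 41, "max_gp": 82, "points": 60},  # kept: exactly half still counts
]
print(career_points(example_seasons))
```

Because the comparison is `>=`, a player who appears in exactly half of the schedule is still counted, which matches the rule as written.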

Here are updated graphs from my previous parts. The HOF analysis and the updated Tableau are further below.


As you may notice, the earlier seasons of the NHL were much more volatile, which we should expect: the league was just forming and many variables were changing during this time, such as GP, # of players, # of teams, etc. The trends mentioned in earlier parts appear much less pronounced, but multipliers for top players remained in the same general range.

Here is the graph of top seasons in history, still dominated by post-expansion players. Most players receive a negative adjustment, which shows up as a point below the white line y=x. The one cluster of points forming a trend at a greater slope (where 25 pts. -> 60 adjusted pts.) consists of players from the first several seasons, when the point rate was much lower.

Similarly, here are the career leaders. The 80s still dominate, in part because of more total GP (seasons and careers were generally longer than in the past) and because many players of the newest era haven't retired yet. Nonetheless, there is a much better mix of talent, which helps achieve the original goal of adjusting points.


Just as last time, here are players visualized by hall of fame status. We see that adjusted points generally do a good job of distinguishing which players make the HOF, especially for forwards. Most players in the top right region are either in the HOF or ineligible. As this appeared to be a good indicator, I created a HOF likelihood by fitting a logistic regression to the data. With inputs of GP, PPG, adj.P, and Position (F/D), the model was trained on all eligible careers (including HOFers). The result gives the likelihood that a player will make the HOF according to the data. Although position was a statistically significant variable, defensemen simply aren't judged by points to the same degree as forwards, so the model had a tough time predicting defensemen. On the other hand, the results for forwards appear to have much more value. Overall, HOF status can't be fully predicted with so few variables, as voters place emphasis on awards won as well as international play, which are both outside the current scope. In Stat Shot by Rob Vollman, a HOF model is developed which accounts for these variables and performs very well.
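For readers who want to see what a model of this shape looks like, here is a minimal sketch of a logistic regression on the same four inputs (GP, PPG, adj.P, and a forward/defense flag). It is not the model behind the scores in this post: the training rows are invented toy careers, the features are rescaled by hand (GP and adj.P in thousands), and the fit uses plain gradient descent written out with the standard library rather than any particular stats package. The names `fit_logistic` and `predict_proba` are hypothetical.

```python
import math


def sigmoid(z):
    """Map a linear score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def predict_proba(w, x):
    """Return the fitted HOF likelihood for one career."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


# Invented toy careers, NOT real players:
# (GP / 1000, PPG, adj.P / 1000, is_forward, in_hof)
rows = [
    (1.40, 1.10, 1.50, 1, 1),
    (1.20, 0.95, 1.20, 1, 1),
    (1.10, 0.60, 0.70, 0, 1),
    (1.30, 0.70, 0.90, 0, 1),
    (0.90, 0.55, 0.50, 1, 0),
    (0.60, 0.40, 0.25, 1, 0),
    (0.80, 0.30, 0.25, 0, 0),
    (0.50, 0.20, 0.10, 0, 0),
]
X = [r[:4] for r in rows]
y = [r[4] for r in rows]

weights = fit_logistic(X, y)
p_star = predict_proba(weights, (1.40, 1.10, 1.50, 1))  # high-scoring forward
p_weak = predict_proba(weights, (0.60, 0.40, 0.25, 1))  # low-scoring forward
print(p_star, p_weak)
```

The output of `predict_proba` is the same kind of 0 to 1 score quoted later in this post, though the toy numbers here carry no meaning beyond showing the mechanics.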




As shown in the visual, the model was applied to ineligible players to get a sense of who we should expect to make the HOF in future years. Here are the top 20s separated by position. Defensemen were given much lower likelihoods than forwards as a group, but Erik Karlsson stands out as a near lock. As expected, Zubov and Gonchar should eventually make their way in. The long list of eligibles following the top group is much less likely to get in, judging both by the model's predictions and by the names. Several are past the point of eligibility and the rest are a group of fringe contenders. Towards the bottom we see Letang leading the group of current players. Other active players of interest with (in my opinion) a real chance of entering the HOF include Zdeno Chara (0.38) and Drew Doughty (0.22). This shows again that, without award recognition, defensive defensemen are ignored.

The forwards model provides a much clearer picture of what to expect. The top 5 are complete locks according to the model and to general public opinion. In the next group are several players who are nearly identical stat-wise to many of the early inductees, but my hockey history knowledge cannot explain why they were left out while others were inducted. The next grouping is a list of current era players, most of whom will likely be strong candidates for the HOF in the future. McDavid stands out because of his very strong PPG numbers. When a minimum for games played is set, Herberts, Denneny, and Carson are omitted and the list is pretty much what you would expect from the current batch of players.

Here are some scores for players in the HOF as a basis for comparison:
Gretzky - 0.9999280 (Highest F)
Orr - 0.9988738 (Highest D)
Lidstrom - 0.9550488 (5th D)
Maurice Richard - 0.9508771 (17th F)
Leetch - 0.9231727 (8th D)
Selanne - 0.9199010 (21st F)
Lindros - 0.8099994 (42nd F)
Hawerchuk - 0.7659057 (48th F)
Bure - 0.6784484 (65th F)
Pronger - 0.66935832 (20th D)
The Worst Scores:

Bob Gainey - 0.002817570 (Lowest F)
Jack Stewart - 0.003486746 (Lowest D)


Finally, here is the updated Tableau for NHL careers. If you mouse over a player, you can see their HOF score and the other data used in the model.


