Goal rates via hockey-reference.com
Historical player data from hockeyabstract.com
**Part 2 Includes New Formula For Adjusted Points and Initial HOF Tableau viz** http://threepointgames.blogspot.com/2018/01/era-adjustments-continued_2.html
**Part 3 Includes HOF likelihood score and Tableau with all 100 seasons and more accurate data**
http://threepointgames.blogspot.com/2018/01/era-adjustments-part-3.html
One of the biggest questions hockey fans struggle to answer is how to compare players and teams across different eras. The style of hockey in the high-scoring 80s is certainly different from that of the dead puck era in the 90s, and both are remarkably different from the current edition of NHL hockey in the salary cap era. Over these periods, drastic changes have taken place around the league, from new rules to expansion, and as a result, we struggle when attempting to compare old and new achievements and records.
Many explanations have been given for the changing scoring rates by season, and one of the most viable is the expansion effect. By increasing the size of the league, talent was much more dispersed, significantly widening the gap between the best and the worst players around the league. This raised scoring and subsequently gave top talents a platform to dominate. Over time, rule changes and the increased popularity of hockey have provided a talent pool deep enough that more teams can be sustained, and the talent gap has decreased. This explanation logically makes sense and can be backed up by big-picture analysis, but there have been no efforts to quantify these changes.
In this article, I attempt to answer how talent distribution has changed in the league, and through this how to adjust player performances so that we can better compare feats such as Wayne Gretzky’s 200+ point record-setting seasons to the dominant 100+ point seasons we have seen from Sidney Crosby.
Using the Gini coefficient with the previously calculated Lorenz curves, we can make several observations. The first is a view of the Gini coefficient for each season, accompanied by a horizontal line marking the average value. From 1967 to 1976 the league grows from 6 to 18 teams, and the coefficient increases modestly but overall remains essentially unchanged. This goes against the original assumption about the immediate effects of expansion. During the next era, from 1976 to 1995, we observe the greatest increases. This is also the era in which the talent gap is generally described as most prominent. The changing distribution of talent is arguably still a result of expansion; however, this graph suggests the effect may not be as immediate as assumed. From the dead puck era to the present day, the Gini coefficient begins its decline, which reflects the general consensus among fans and writers that the talent gap is shrinking.
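For readers who want to reproduce the idea, here is a minimal sketch of how a Gini coefficient can be derived from a Lorenz curve of player point totals. It is not my original code: the function names and the two tiny sample "leagues" are hypothetical, and the area under the curve is approximated with the trapezoid rule.

```python
def lorenz_curve(points):
    """Cumulative share of league points held by the bottom k players, for k = 0..n."""
    ordered = sorted(points)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total points must be positive")
    curve = [0.0]
    running = 0
    for p in ordered:
        running += p
        curve.append(running / total)
    return curve


def gini(points):
    """Gini coefficient: 1 minus twice the area under the Lorenz curve."""
    curve = lorenz_curve(points)
    n = len(curve) - 1
    # Trapezoid rule over n equal-width slices of the player population.
    area = sum((curve[i] + curve[i + 1]) / 2 for i in range(n)) / n
    return 1 - 2 * area


# Hypothetical point totals, for illustration only (not real NHL data).
even_league = [40, 40, 40, 40]
top_heavy_league = [5, 10, 25, 120]

print(gini(even_league))
print(gini(top_heavy_league))
```

A perfectly even league scores 0, and the value rises toward 1 as more of the points concentrate in a few players, which is the sense in which a higher coefficient signals a wider talent gap.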
===
Now that we have observed that talent distribution is, in fact, a changing variable throughout NHL history, we will take a new approach to compare players across eras. Using this knowledge we can try to better answer: How would Wayne Gretzky’s performance relative to his peers look if he played in the NHL today?
Once each season was calculated, I computed a multiplier for each percentile by comparing the base year to the year in question, simply dividing the base curve by the given year's curve. In this example, we can see the 1967-68 distribution (first season) compared to the 2015-16 distribution (most recent season). When the 1968 curve is below the base season curve (2016), the multiplier is less than one, signifying that the specific percentile of players contributed more points than that percentile would in the base season. When the multiplier is greater than one, it indicates that a player performing at that percentile contributed fewer points than if they were playing in the base season. When this multiplier is applied to every player in a given season, the overall distribution of points for that season will resemble the distribution of points in the base year. Since it is done at a percentile level, the adjustments take into account the performance of multiple players, so a player isn’t penalized as much for their own performance.
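The percentile step can be sketched roughly as follows. This is an illustration under stated assumptions, not my original code: I assume here that each season's "curve" is the point total found at each percentile (nearest-rank method), and the function names and sample seasons are hypothetical.

```python
def percentile_curve(points):
    """Point total at each percentile 1..100, using the nearest-rank method."""
    ordered = sorted(points)
    n = len(ordered)
    curve = []
    for pct in range(1, 101):
        rank = (pct * n + 99) // 100  # integer ceiling of pct * n / 100
        curve.append(ordered[rank - 1])
    return curve


def percentile_multipliers(base_points, season_points):
    """Base-year curve divided by the given season's curve, per percentile."""
    base = percentile_curve(base_points)
    season = percentile_curve(season_points)
    # Assumption: leave a percentile unchanged when the season value is zero.
    return [b / s if s > 0 else 1.0 for b, s in zip(base, season)]


def percentile_of(value, season_points):
    """Percentile (1..100) of one point total within its own season."""
    n = len(season_points)
    at_or_below = sum(1 for p in season_points if p <= value)
    return max(1, (100 * at_or_below + n - 1) // n)


def adjust(value, season_points, multipliers):
    """Scale a player's total by the multiplier for his percentile."""
    return value * multipliers[percentile_of(value, season_points) - 1]


# Hypothetical seasons, for illustration only (not real NHL data).
base_season = [2, 10, 30, 90]   # more even distribution
old_season = [1, 5, 20, 150]    # more top-heavy distribution

mults = percentile_multipliers(base_season, old_season)
print(adjust(150, old_season, mults))
print(adjust(5, old_season, mults))
```

In this toy case the top scorer of the top-heavy season is scaled down while the lower percentiles are scaled up, so the adjusted season takes on the shape of the base season.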
When graphing just these findings, we learn that the era adjustment is not too extreme in its magnitude, but still has a strong effect. Since the 2015-16 season had a rather even distribution, most players in the historic sample have a multiplier of less than 1, meaning their adjusted PTS were less than their actual PTS. Players during the 70s were usually the recipients of the increases. Overall, the era adjustment had a maximum multiplier of approximately 3 and a minimum of approximately 0.7. These extremes were only present among players in the first few percentiles, where a 1 point season was changed to 3 points and vice versa. Among prominent players, the adjustment stayed within +/- 40 points, with multipliers between 0.8 and 1.3.
When combined with the typical adjustment for goal scoring rates, we can begin to get more realistic comparisons between eras. Overall, the largest gain belonged to Martin St. Louis, whose 03-04 total rose by 33 points, from 94 to 127. On the other hand, the subtractions were much more significant. Gretzky and Lemieux lost 82 points each in their 81-82 and 88-89 seasons respectively, bringing their totals to 130 and 117. These point totals seem much more achievable, yet still reflect the dominance these players showed in their careers.
Overall, this combined adjustment does a great job of improving our comparisons between players from different eras, although the multipliers are not strong enough to avoid “favoring” historic contributions by high scoring players of the '80s and '90s (or the top players today are just relatively worse). The adjustment definitely has room to be improved, but I believe this helps our understanding of how players performed in a given season compared to the current NHL.
In future renditions, there are several improvements I hope to make. For one, applying the goal scoring multiplier to points assumes that goals and points scale evenly, which I am not too confident in. Finding the points/GP rate and applying that could help for specifically studying PTS. I also hastily applied the same approach to goals as I did to points, and found interesting results. In the future perhaps I’ll use this methodology to see who the best goal-scorer of all time is. I also hope to refine the method for calculating the multipliers, as I would like to make sure each percentile is reflective of the league distribution rather than being too affected by an individual player. The NHL also recently released updated data that is more accurate and goes further into the past. Using this dataset will improve the calculation accuracy and allow the full 100-year history of the league to be analyzed, rather than just the last 50 years. I’d also like to add the 16-17 season.
Below are top 20 lists for PPG, career PTS and individual season PTS. Note that goaldif + PTS.dif does not equal netdif, as they are the results of separate multipliers. For example, a 100 point season with multipliers of 0.5 and 0.5 would have two individual differences of -50 points each, but the net difference would be -75 (total 25 pts.), not -100 (total 0). Also note that Malkin, Iginla, and Thornton were left off of the NHL top 100.
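The arithmetic behind that note can be checked with a short sketch. The function and variable names are hypothetical stand-ins for the goaldif, PTS.dif and netdif columns, and I assume the two multipliers are applied one after the other (multiplied together), which is what the 100 point example implies.

```python
def point_differences(raw_pts, goal_mult, dist_mult):
    """Return (goal_dif, pts_dif, net_dif, adjusted_pts) for one season."""
    goal_dif = raw_pts * goal_mult - raw_pts   # effect of the goal-rate multiplier alone
    pts_dif = raw_pts * dist_mult - raw_pts    # effect of the distribution multiplier alone
    adjusted_pts = raw_pts * goal_mult * dist_mult
    net_dif = adjusted_pts - raw_pts           # effect of both multipliers together
    return goal_dif, pts_dif, net_dif, adjusted_pts


goal_dif, pts_dif, net_dif, adjusted_pts = point_differences(100, 0.5, 0.5)
print(goal_dif, pts_dif, net_dif, adjusted_pts)
```

Because the second multiplier acts on an already reduced total, the two individual differences overstate the combined change whenever both multipliers are below 1.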
If you have any questions or comments, would like to see the complete data, my code, or any additional visuals, please email me at jflancer@wharton.upenn.edu or comment below!
Data from the 1967-68 to 2015-16 seasons, ignoring lockout seasons ('94-95, '12-13)