Posts

New Project Announcements

Recent posts

Can We Predict NHL Success from World Juniors Performance?

Every year during the World Juniors we hear two prevailing opinions on prospect analysis from fans and analysts: 1) the sample size is too small to mean anything, and 2) this player will excel in the NHL because of his WJC performance. Obviously, there is a middle ground between these two perspectives, but to my knowledge, it hasn’t been explored in great detail. In this piece, I’ll be looking at what we can learn from World Juniors data. To perform this analysis, I’ll be using NHL career data from Rob Vollman (I’m using a slightly older version cut off at the end of the 16-17 season), World Juniors data collected from Elite Prospects, and draft data via Hockey Reference. The first decision I made was to consider only the World Juniors performance of players who have been drafted. Although this is limiting, I think it’s safe to restrict the data to players truly considered NHL prospects. The first exploratory step is to look into the relationship between Wo

New Work on Other Websites

I recently wrote a fan post on Broad Street Hockey breaking down how the Flyers have drafted since the lockout, along with some general league-wide draft analysis. Here's a link. Additionally, several of these graphs, or close equivalents, were featured in this article on the Islanders' draft history by Arthur Staple from The Athletic. I also wrote a tutorial on using gganimate and R to create shot location game gifs. This can be found at http://barloweanalytics.com/gganimate.html

Who Plays Where? Determining Skater Positions Using Clustering

While browsing through various websites that keep NHL player stats, I realized that the league does a terrible job of keeping player positions up to date. I’m not exactly sure how or where they get their data from, but it is quite inaccurate. All sites do distinguish between forwards and defensemen, which is enough for most analysis, but I still think more specific player positions hold value, especially when looking at team depth and related areas. In an attempt to solve this problem, I decided to use k-means clustering on location information within play-by-play data (thanks to Emmanuel Perry and Corsica Hockey for making this cleaned data available to the public). Clustering has been used pretty frequently in hockey analysis, most recently (I believe) by Alex Novet to identify different styles of goal scorers. It has also been used by Ryan Stimson to identify team and player styles with data collected from his passing project and similarly on DTM About Heart’s old blog
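As a rough illustration of the clustering idea described above, here is a minimal k-means sketch in plain Python. Everything specific in it is an assumption made for illustration: the toy coordinates, the choice of two clusters, and the fixed starting centroids are invented, and the function name `kmeans` is hypothetical. The original analysis used play-by-play location data from Corsica Hockey, and its actual features and number of clusters are not shown in this excerpt.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Basic Lloyd's algorithm: assign points to the nearest centroid,
    then move each centroid to the mean of its assigned points."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for each point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: recompute each centroid as the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Invented toy data: one (x, y) average event location per skater.
# Positive y stands in for one side of the ice, negative y for the other.
toy_locations = [(60, 20), (62, 22), (58, 18), (61, -21), (59, -19), (63, -20)]
start = [(60, 10), (60, -10)]  # fixed starting centroids, chosen by hand

labels, centers = kmeans(toy_locations, start)
```

Fixing the starting centroids keeps the sketch deterministic. A real analysis would more likely use random or k-means++ initialization with several restarts, and would need to choose the number of clusters deliberately.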

Tape to Tape Shot Visualization

In this post, I'll be breaking down my newest (and favorite) viz, which acts as a pretty comprehensive overview of tape to tape shot data. This is based on my previous tape to tape viz but has many new features. I'm going to go through each component of the display below and explain how they work. You'll be able to work with the viz at the bottom of the page, and any feedback or suggestions are greatly appreciated.

1) The Rink

First I'm going to explain what you're directly looking at. There are three parts to the rink: the points, the lines, and the tooltip (the box that pops out when you hover over a point/line). Both points and lines are colored by the event result. Goals are green, shots on goal are blue, and missed shots are tan. There are two different points: a circle and a square. Circles represent either where a pass was made or received. Squares represent the location of shot attempts. Lines show the flow of events. They grow in size as the eve

Introducing the Definitive Hart Trophy Metric: The Wyshynski Score

With the regular season coming to an end, and the Hart Trophy race at its most contentious in years, I planned to make a model predicting how the voting would turn out. Instead of going through the steps to determine which stats are most predictive and which stats measure the "player judged most valuable to his team", I decided to use the perfect criteria set forth by ESPN's Senior NHL Writer, Greg Wyshynski. These are by far the most logical and rational guidelines, so the results of this calculation are absolutely flawless. The Wyshynski Score is calculated as follows: Percent of Team's Points Scored by Player * 10 + (0.579 - Team PTS%) * (if < 0, 10, else 0) + Player Points / 2nd Best Player on the Team Points + Player Points Per Game (20 GP min.). To validate this methodology: Taylor Hall is number one, so this method must be correct. Furthermore, Connor McDavid, the best and most valuable player in the league, is in 10th place - now we know
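To make the formula above concrete, here is a minimal Python sketch of one literal reading of it. Several things are assumptions rather than anything the post confirms: the function and parameter names are hypothetical, "Percent of Team's Points" is taken as a fraction between 0 and 1, the "(if < 0, 10, else 0)" part is read as "multiply the difference by 10 when it is negative, otherwise by 0", and players under the 20 GP minimum are simply given no score. The example inputs are made up and do not describe any real player.

```python
def wyshynski_score(share_of_team_points, team_pts_pct, player_points,
                    second_best_points, games_played):
    """One literal reading of the Wyshynski Score formula (assumptions above)."""
    if games_played < 20:
        return None  # assumed handling of the 20 GP minimum

    # Term 1: share of the team's points scored by the player, times 10.
    share_term = share_of_team_points * 10

    # Term 2: (0.579 - team PTS%), times 10 only when the difference is negative.
    gap = 0.579 - team_pts_pct
    team_term = gap * (10 if gap < 0 else 0)

    # Term 3: player's points relative to the team's second-best scorer.
    ratio_term = player_points / second_best_points

    # Term 4: player's points per game.
    per_game_term = player_points / games_played

    return share_term + team_term + ratio_term + per_game_term


# Made-up inputs for illustration only.
example = wyshynski_score(
    share_of_team_points=0.15,
    team_pts_pct=0.600,
    player_points=90,
    second_best_points=60,
    games_played=75,
)
```

Under this reading, the second term is zero for any team at or below a .579 points percentage and negative for teams above it, so it only ever subtracts from the score.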

Measuring Consistency

A while back I decided to look into player consistency, but after doing initial calculations, I never went any further. After Namita Nandakumar's VANHAC presentation on consistency, I decided to go back, refine my old work, and release the results. Namita's methodology is likely much more statistically relevant and meaningful, but I use a different approach that I think is worth sharing. The methodology I adopted was taken from this article on Nylon Calculus on NBA player consistency, written by Hal Brown. This consistency metric is the normalized variance of a player's performance for a given metric. In this post, I will be calculating the consistency of a player's game score (GS) in individual seasons, from 2007-08 to 2015-16, using the data provided at the bottom of the linked game score article. I also have a folder with my code, better resolution graphs, and data at the bottom of the article if you'd like to check it out. Calculation This