A Screaming Comes Across the Court: 2013 Retrodiction: How Player Metrics Predicted Wins

As the 2014 season starts, let's not forget the past and what we can learn. NBA player metrics are more popular than ever, but which ones do we trust? How useful are they? One method for evaluating metrics is to use the past values for each player to predict team wins in the current season. For example, for the Lakers we use Dwight Howard's PER in 2012 along with his minutes in 2013, we use Kobe Bryant's PER in 2012 along with his minutes in 2013, and so on until we have every player.

The metrics
PER: John Hollinger's invention. Explained here. Uses box score stats to create a player per-minute productivity value. Values usage highly. Adjusted for pace.
Win Shares: Uses box score stats to calculate individual wins. A large team defense factor is evenly divided among players. Values efficiency.

Wins Produced: Like Win Shares, uses box score stats to calculate individual wins. Instead of a large team defense factor, heavily values rebounding. Also values efficiency.

RAPM: Ridge-regression adjusted plus/minus. Fundamentally based on whether a player's team scores more often or allows fewer points when he's on the court. Adjusts for pace and strength of schedule, among other factors. The "ridge" part regresses heavily to a prior value based on possessions, i.e. it's hard for a player to move away from his prior if he rarely plays.

-npi RAPM: Non-prior informed RAPM. The priors are all set to 0.

-Vanilla: Uses the previous three seasons of npi RAPM. The most recent seasons are weighted more heavily.

-RAPM: The prior is the previous season's RAPM. The starting point is the first dataset used (2001.)

-xRAPM: A mix of RAPM and a statistical plus/minus model using box score stats and height.

To rate the various metrics, we can use what's known as the root-mean-square error. Square the difference between actual wins and predicted wins for each team, average those squared differences, and then take the square root. Squaring penalizes the biggest errors most heavily, and taking the root puts the final units back in wins, not squared wins.
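As a minimal sketch of the error measure described above (the function name and the two-team example are made up for illustration, not taken from the actual retrodiction data):

```python
import math


def rmse(actual_wins, predicted_wins):
    """Root-mean-square error between actual and predicted team wins."""
    if len(actual_wins) != len(predicted_wins):
        raise ValueError("both lists must cover the same teams")
    # Squaring makes a 10-win miss count four times as much as a 5-win miss.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual_wins, predicted_wins)]
    # Averaging and then taking the root returns the result to units of wins.
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical two-team league: misses of 3 and 4 wins.
print(rmse([50, 40], [47, 44]))
```

In the real exercise the lists would hold all 30 teams, once with real wins and once with Pythagorean wins.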

Which metric won the 2013 retrodiction? xRAPM blew away every other metric. A root-mean-square error under 6 is extremely good for a regular-season prediction. Although knowing the minutes distribution is a huge advantage, it's worth noting that the best analysts and Vegas topped out at 5.94 for the 2013 season. (Interestingly, while PER did not perform well, Hollinger's own predictions did.) The worst-performing metric is Wins Produced. Only PER does comparably poorly, and only against Pythagorean wins rather than "real" wins. Most metrics had minor differences in how they predicted real wins and Pythagorean wins, but PER "lucked out" in that real wins landed much closer to its predictions. Win Shares does extremely well and beats out every adjusted plus/minus method except xRAPM, but we won't stop here.

RMSE (wins)	PER	Win Shares	Wins Produced	npi RAPM	Vanilla RAPM	RAPM	xRAPM
Real wins	7.00	6.62	7.95	7.51	7.27	6.74	5.67
Pyth. wins	7.98	6.93	7.75	7.41	7.23	7.11	5.85

Since you're really not testing anything on teams with low roster turnover, I used the percentage of minutes from new players excluding rookies to calculate squared errors at three points: no new players, half the minutes from new players, and a completely new team. Why is this important? Box score metrics assume defense is explained by rebounds/blocks/steals, and since a good defender will force more missed shots (i.e. rebounds), it's hard to test this without the player switching teams.
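One way to get squared errors at those three points is to fit a straight line through each team's squared error against its share of minutes from new players, then read the line off at 0, 50, and 100. The post does not say exactly how the numbers were computed, so treat the sketch below as an assumption; the function names and sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (x, y) pairs: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def squared_error_at(intercept, slope, points=(0, 50, 100)):
    """Evaluate the fitted line at the chosen new-player minute percentages."""
    return {p: intercept + slope * p for p in points}


# Hypothetical data: percent of minutes from new players, squared win error.
pct_new_minutes = [0, 10, 20]
squared_errors = [1.0, 3.0, 5.0]
intercept, slope = fit_line(pct_new_minutes, squared_errors)
print(squared_error_at(intercept, slope))
```

If the numbers were produced this way, it would explain how an entry at 0% can come out slightly negative: a fitted line is not constrained to stay above zero.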

The results are shown in the table below. The most relevant line is 50% because the team with the most roster turnover topped out at 59.3%. The "Win" metrics do very well when the roster doesn't change, which isn't surprising because they are built from offensive/defensive team measures, but they start to perform horribly with heavy roster change. Weirdly, for real wins PER's error doesn't change much as the roster does; there was something kooky going on with PER trying to explain teams with roster change (case in point: the Rockets gained Asik, whom PER underrates because he's a defensive player who doesn't shoot well, but the Rockets underperformed their point differential.)

The RAPM metrics are rough in estimating wins on teams with no roster turnover, but this makes sense: plus/minus models are known for their noise. However, they do much better with roster turnover, especially the metrics that use priors: xRAPM and RAPM. This is despite completely whiffing on the Lakers, which the box score metrics did not do, but that team struggled with injuries and chemistry.

PER does the best at 100% minutes from new players, but, again, the 50% line is more relevant, and it's more the case that PER was horrible at teams with no roster change, which makes it look better when projected out to 100%. As a last note, non-prior forms of RAPM are generally disregarded, but they still appear to be as predictive as widely used box score metrics like Win Shares.

	% of mins. from new players	PER	Win Shares	Wins Produced	npi RAPM	Vanilla RAPM	RAPM	xRAPM
Real wins	0	41.3	-3.6	3.1	19.7	16.3	24.8	16.4
	50	53.5	71.4	98.3	77.7	74.1	57.4	41.4
	100	65.6	146.3	193.4	135.7	131.9	90.0	66.4
Pyth. wins	0	43.7	0.2	15.7	21.7	26.2	22.1	20.6
	50	75.3	75.9	85.9	74.1	67.4	67.0	42.2
	100	106.8	151.7	156.2	126.5	108.7	111.9	63.8

How to translate the metrics to wins

For Win Shares and Wins Produced, the work has already been done. Just multiply each player's per-48-minute rate by his minutes and divide by 48.
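A rough sketch of that step for Win Shares, assuming each player's prior-season WS/48 is paired with his current-season minutes. The function and player names are hypothetical, and treating every player without a prior value as a rookie is a simplification of mine; the 0.05 default is the rookie WS/48 value listed at the end of this post.

```python
ROOKIE_WS_PER_48 = 0.05  # rookie default listed at the end of the post


def projected_team_wins(current_minutes, prior_ws_per_48):
    """Sum prior-season WS/48 times current-season minutes over a roster.

    current_minutes: dict of player -> minutes played this season
    prior_ws_per_48: dict of player -> WS/48 from the previous season
    """
    total = 0.0
    for player, minutes in current_minutes.items():
        # Simplifying assumption: no prior value means the player is a rookie.
        rate = prior_ws_per_48.get(player, ROOKIE_WS_PER_48)
        total += rate * minutes / 48.0
    return total


# Hypothetical two-man roster: one veteran and one rookie.
print(projected_team_wins({"Veteran": 2400, "Rookie": 480}, {"Veteran": 0.2}))
```

The same shape would work for Wins Produced by swapping in WP/48 values.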

PER has a companion stat for wins: EWA (estimated wins added.) It's explained here:
VA: Value Added - the estimated number of points a player adds to a team’s season total above what a 'replacement player' (for instance, the 12th man on the roster) would produce. Value Added = ([Minutes * (PER - PRL)] / 67). PRL (Position Replacement Level) = 11.5 for power forwards, 11.0 for point guards, 10.6 for centers, 10.5 for shooting guards and small forwards

EWA: Estimated Wins Added - Value Added divided by 30, giving the estimated number of wins a player adds to a team’s season total above what a 'replacement player' would produce.
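The two formulas quoted above translate directly into code. This is only a sketch: the position abbreviations used as dictionary keys and the function names are my own labels, while the replacement levels and the divisors 67 and 30 come from the quoted definitions.

```python
# Position Replacement Level (PRL) values from the definition above.
PRL = {"PF": 11.5, "PG": 11.0, "C": 10.6, "SG": 10.5, "SF": 10.5}


def value_added(minutes, per, position):
    """Estimated points added over a replacement player."""
    return minutes * (per - PRL[position]) / 67.0


def estimated_wins_added(minutes, per, position):
    """Value Added divided by 30 gives estimated wins over replacement."""
    return value_added(minutes, per, position) / 30.0


# Hypothetical point guard: 2,010 minutes at a PER of 21.0.
print(estimated_wins_added(2010, 21.0, "PG"))
```

For the retrodiction, the PER would come from the previous season and the minutes from the current one.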

Plus/minus stats are translated to wins using the Pythagorean method. The formula is points scored^14/(points scored^14+points allowed^14). The expected plus/minus of the team is translated into points scored versus points allowed relative to the league average.
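Here is a minimal sketch of the Pythagorean step. It assumes the team's expected plus/minus is expressed as points per game and is split evenly between offense and defense around the league average; the league-average figure of 100 points is a placeholder, and the function name is hypothetical. Only the exponent of 14 comes from the formula above.

```python
def pythagorean_wins(point_diff_per_game, league_avg_points=100.0,
                     games=82, exponent=14):
    """Convert an expected per-game point differential into expected wins."""
    # Assumption: half the differential goes to offense, half to defense.
    points_scored = league_avg_points + point_diff_per_game / 2.0
    points_allowed = league_avg_points - point_diff_per_game / 2.0
    win_pct = points_scored ** exponent / (
        points_scored ** exponent + points_allowed ** exponent
    )
    return games * win_pct


# An average team (differential of 0) projects to a .500 record.
print(pythagorean_wins(0.0))
# A hypothetical team expected to outscore opponents by 5 points per game.
print(pythagorean_wins(5.0))
```

How the individual player ratings are combined into a team plus/minus is a separate step that this sketch does not cover.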

Rookies are assumed to be heavy negatives. The average value for rookies was estimated from past values.
WS/48 mins: 0.05
PER: 13
WP/48 mins: 0.05
Plus/minus (RAPM's): -1.96

3 comments:

Tony, October 30, 2013 at 10:40 PM
Awesome stuff, thanks!
ff, November 1, 2013 at 12:11 AM
Have you tried inserting 2011 or even 2010 numbers to see which metric predicted 2013 the best? I would love to see which one is best at predictions long term.
Neil Paine, November 4, 2013 at 4:35 PM
@ff - This might be of interest: http://www.apbr.org/metrics/viewtopic.php?p=15334&sid=0795ee85bf0fcf39da857f6975e373ec#p15334

Wednesday, October 30, 2013
