The metrics
PER: John Hollinger's invention. Explained here. Uses box score stats to create a player per-minute productivity value. Values usage highly. Adjusted for pace.
Win Shares: Uses box score stats to calculate individual wins. A large team defense factor is evenly divided among players. Values efficiency.
Wins Produced: Like Win Shares, uses box score stats to calculate individual wins. Instead of a large team defense factor, heavily values rebounding. Also values efficiency.
RAPM: Ridge-regression adjusted plus/minus. Fundamentally based on whether a player's team scores more or allows fewer points when he's on the court. Adjusts for pace and strength of schedule, among other factors. The "ridge" part regresses heavily to a prior value based on possessions, i.e., it's hard for a player to move away from his prior if he rarely plays.
-npi RAPM: Non-prior informed RAPM. The priors are all set to 0.
-Vanilla: Uses the previous three seasons of npi RAPM. The most recent seasons are weighted more heavily.
-RAPM: The prior is the previous season's RAPM. The starting point is the first dataset used (2001).
-xRAPM: A mix of RAPM and a statistical plus/minus model using box score stats and height.
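To see why a rarely used player stays close to his prior, here is a minimal sketch of ridge-style shrinkage for a single player. It is a deliberate simplification: real RAPM fits every player at once, while this one-parameter version only minimizes the squared error around one player's observations plus a penalty for moving away from the prior. The function name, the penalty value, and the sample numbers are all made up for illustration.

```python
def shrunk_estimate(observations, prior, penalty):
    """One-parameter ridge estimate that shrinks toward a prior.

    Minimizes sum((y - b)**2 for y in observations) + penalty * (b - prior)**2,
    which has the closed-form solution below. Hypothetical helper, not the
    actual RAPM computation.
    """
    n = len(observations)
    return (sum(observations) + penalty * prior) / (n + penalty)


# Made-up example: two players with the same raw +8.0 margin per observation
# and the same -2.0 prior, but very different sample sizes.
prior = -2.0
penalty = 100.0  # assumed penalty strength, chosen only for illustration

bench_player = shrunk_estimate([8.0] * 10, prior, penalty)
starter = shrunk_estimate([8.0] * 1000, prior, penalty)

print(round(bench_player, 2))  # -1.09, still close to the prior
print(round(starter, 2))       # 7.09, close to the observed value
```

With the priors set to 0, as in npi RAPM, the same formula simply pulls every estimate toward zero.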
To rate the various metrics, we can use what's known as the root-mean-square error (RMSE). Square the difference between wins and predicted wins for each team, average those squared differences, and then take the square root. Squaring penalizes the biggest errors most heavily, and taking the root of the mean puts the final units back in wins, not squared wins.
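As a rough sketch of that calculation, the snippet below computes a root-mean-square error for a handful of teams. The win totals are invented for illustration and are not taken from the 2013 results.

```python
import math


def rmse(actual_wins, predicted_wins):
    """Root-mean-square error between actual and predicted win totals."""
    if not actual_wins or len(actual_wins) != len(predicted_wins):
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual_wins, predicted_wins)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up example: three teams with errors of 5, -2, and -8 wins.
actual = [60, 45, 30]
predicted = [55, 47, 38]
print(round(rmse(actual, predicted), 2))  # 5.57
```

Note how the single 8-win miss contributes 64 of the 93 total squared error, which is what "penalizing the biggest errors" means in practice.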
Which metric won the 2013 retrodiction? xRAPM blew away every other metric. A root-mean-square error under 6 is extremely good for a regular-season prediction. Although knowing the minutes distribution is a huge advantage, it's worth noting that the best analysts and Vegas topped out at 5.94 for the 2013 season. (Interestingly, while PER did not perform well, Hollinger's own predictions did.) The worst-performing metric is Wins Produced. Only PER approaches how poorly Wins Produced did, and only against Pythagorean wins, not "real" wins. Most metrics showed minor differences between how they predicted real wins and Pythagorean wins, but PER "lucked out" in that real wins landed much closer to its predictions. Win Shares does extremely well and beats out every adjusted plus/minus method other than xRAPM, but we won't stop here.
| | PER | Win Shares | Wins Produced | npi RAPM | Vanilla RAPM | RAPM | xRAPM |
|---|---|---|---|---|---|---|---|
| Real wins | 7.00 | 6.62 | 7.95 | 7.51 | 7.27 | 6.74 | 5.67 |
| Pyth. wins | 7.98 | 6.93 | 7.75 | 7.41 | 7.23 | 7.11 | 5.85 |
The results are shown in the table below. The most relevant line is 50% because the team with the most roster turnover topped out at 59.3%. The "Win" metrics (Win Shares and Wins Produced) do very well when the roster doesn't change, which isn't surprising because they are built from offensive/defensive team measures, but they start to perform horribly with heavy roster change. Weirdly, for real wins PER doesn't change much as roster turnover increases; something kooky was going on with PER trying to explain teams with roster change (case in point: the Rockets gained Asik, whom PER underrates because he's a defensive player who doesn't shoot well, but the Rockets underperformed their point differential).
The RAPM metrics are rough in estimating wins on teams with no roster turnover, but this makes sense: plus/minus models are known for their noise. However, they do much better with roster turnover, especially the metrics that use priors: xRAPM and RAPM. This is despite completely whiffing on the Lakers, which the box score metrics did not do, but that team struggled with injuries and chemistry.
PER does the best at 100% of minutes from new players for real wins, but, again, the 50% line is more relevant, and it's more the case that PER was horrible with teams that had no roster change, which makes it look better when projected out to 100%. As a last note, non-prior forms of RAPM are generally disregarded, but they still appear to be as predictive as widely used box score metrics like Win Shares.
| | % of mins. from new players | PER | Win Shares | Wins Produced | npi RAPM | Vanilla RAPM | RAPM | xRAPM |
|---|---|---|---|---|---|---|---|---|
| Real wins | 0 | 41.3 | -3.6 | 3.1 | 19.7 | 16.3 | 24.8 | 16.4 |
| Real wins | 50 | 53.5 | 71.4 | 98.3 | 77.7 | 74.1 | 57.4 | 41.4 |
| Real wins | 100 | 65.6 | 146.3 | 193.4 | 135.7 | 131.9 | 90.0 | 66.4 |
| Pyth. wins | 0 | 43.7 | 0.2 | 15.7 | 21.7 | 26.2 | 22.1 | 20.6 |
| Pyth. wins | 50 | 75.3 | 75.9 | 85.9 | 74.1 | 67.4 | 67.0 | 42.2 |
| Pyth. wins | 100 | 106.8 | 151.7 | 156.2 | 126.5 | 108.7 | 111.9 | 63.8 |
How to translate the metrics to wins
For Win Shares and Wins Produced, the work has already been done. Just multiply the per-minute rate by minutes played.
PER has a companion stat for wins: EWA (Estimated Wins Added). It's explained here:
VA: Value Added - the estimated number of points a player adds to a team’s season total above what a 'replacement player' (for instance, the 12th man on the roster) would produce. Value Added = ([Minutes * (PER - PRL)] / 67). PRL (Position Replacement Level) = 11.5 for power forwards, 11.0 for point guards, 10.6 for centers, 10.5 for shooting guards and small forwards
EWA: Estimated Wins Added - Value Added divided by 30, giving the estimated number of wins a player adds to a team’s season total above what a 'replacement player' would produce.
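The two definitions above translate directly into a few lines of code. This is a sketch of the quoted formulas only; the position abbreviations, function names, and the sample player are assumptions for illustration.

```python
# Position Replacement Level values as quoted above.
# The abbreviations used as keys are an assumption for this sketch.
PRL = {"PF": 11.5, "PG": 11.0, "C": 10.6, "SG": 10.5, "SF": 10.5}


def value_added(minutes, per, position):
    """Value Added = (Minutes * (PER - PRL)) / 67."""
    return minutes * (per - PRL[position]) / 67


def estimated_wins_added(minutes, per, position):
    """EWA = Value Added / 30."""
    return value_added(minutes, per, position) / 30


# Hypothetical point guard: 2,680 minutes at a PER of 21.0.
print(value_added(2680, 21.0, "PG"))                     # 400.0
print(round(estimated_wins_added(2680, 21.0, "PG"), 1))  # 13.3
```

A player whose PER sits below his position's replacement level comes out with negative Value Added and therefore negative EWA.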
Plus/minus stats are translated to wins using the Pythagorean method. The formula for expected winning percentage is points scored^14/(points scored^14+points allowed^14). The expected plus/minus of the team is translated into points scored versus points allowed relative to the league average.
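Here is a minimal sketch of that conversion. The Pythagorean formula with an exponent of 14 comes from the text; everything else is an assumption made for illustration: the post does not say exactly how the team plus/minus is split into points scored and allowed, so this sketch splits the margin evenly around a placeholder league average of 100 points and assumes an 82-game season.

```python
def pythagorean_win_pct(points_scored, points_allowed, exponent=14):
    """Expected winning percentage: PS^14 / (PS^14 + PA^14)."""
    scored = points_scored ** exponent
    allowed = points_allowed ** exponent
    return scored / (scored + allowed)


def expected_wins(team_plus_minus, league_avg_points=100.0, games=82):
    """Convert a team's expected point margin into expected wins.

    Assumption: the margin is split evenly, half added to points scored
    and half subtracted from points allowed, around a placeholder league
    average. The original method may differ.
    """
    scored = league_avg_points + team_plus_minus / 2
    allowed = league_avg_points - team_plus_minus / 2
    return games * pythagorean_win_pct(scored, allowed)


print(expected_wins(0.0))             # 41.0, an average team wins half its games
print(round(expected_wins(5.0), 1))   # a +5 team, well above 41 wins
print(round(expected_wins(-5.0), 1))  # a -5 team, the mirror image below 41
```

Because of the high exponent, small changes in point margin move the expected win total quite a bit, which is why errors in a team's projected plus/minus show up clearly in the win predictions.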
Rookies are assumed to be heavy negatives. The average value for rookies was estimated from past values.
WS/48 mins: 0.05
PER: 13
WP/48 mins: 0.05
Plus/minus (RAPMs): -1.96
Awesome stuff, thanks!
Have you tried inserting 2011 or even 2010 numbers to see which metric predicted 2013 the best? I would love to see which one is best at predictions long term.
@ff - This might be of interest: http://www.apbr.org/metrics/viewtopic.php?p=15334&sid=0795ee85bf0fcf39da857f6975e373ec#p15334