Saturday, December 21, 2013

1997-98 RAPM: Prior Informed (RPI)

Background for what RAPM is: +/- was a revolution for the NBA because it allowed a completely new method of evaluating players. You look at how a team scores and defends with a player on the court and without him. When you set players as variables, you can use regression to calculate each player's impact. It's a full-scope view of what matters in a game: outscoring your opponent. However, it's noisy for a number of reasons. One is that some players almost always share the court, so the lineups that would separate them are rare and the model struggles to split the credit (this is known as collinearity). Another is that the models don't deal well with low-minute players, who don't have enough of a sample for an accurate estimate and will often get a ludicrous result just to "fit" the data better.
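To make "players as variables" concrete, here is a minimal sketch in Python of how stints could be turned into regression inputs. The stint format, the two-on-two lineups, and the function name are made up for illustration, and the target here is a simple home margin per 100 possessions; it isn't the actual data pipeline behind these results.

```python
def build_design_matrix(stints, players):
    """One row per stint: +1 for home players on court, -1 for away, 0 otherwise."""
    index = {name: j for j, name in enumerate(players)}
    rows, margins = [], []
    for stint in stints:
        row = [0] * len(players)
        for name in stint["home"]:
            row[index[name]] = 1
        for name in stint["away"]:
            row[index[name]] = -1
        rows.append(row)
        # Target: home margin per 100 possessions during this stint.
        margin = 100.0 * (stint["home_pts"] - stint["away_pts"]) / stint["poss"]
        margins.append(margin)
    return rows, margins


# Hypothetical two-on-two stints (real data has five players per side).
players = ["A", "B", "C", "D"]
stints = [
    {"home": ["A", "B"], "away": ["C", "D"], "home_pts": 12, "away_pts": 8, "poss": 10},
    {"home": ["A", "C"], "away": ["B", "D"], "home_pts": 5, "away_pts": 5, "poss": 5},
]
X, y = build_design_matrix(stints, players)
print(X)  # [[1, 1, -1, -1], [1, -1, 1, -1]]
print(y)  # [40.0, 0.0]
```

If A and B never played apart, their columns would be identical in every row, and the regression would have no way to tell them apart. That's the collinearity problem in a nutshell.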

In simple terms, RAPM deals with this by introducing a heavy dose of regression to the mean. While traditional adjusted +/- creates a model by minimizing error (the difference between the actual points per possession scored/allowed and the expected), RAPM also penalizes the size of the coefficients in the model, with the strength of that penalty set by a lambda term. The coefficients are shrunk toward the "prior," which can be set as zero or as a set of prior values (like the previous season's result). Players with few possessions/minutes will have results close to their priors because their sample size isn't big enough to prove to the model they're more or less valuable.
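The pull toward the prior is easiest to see with a single player in isolation. This is a simplified sketch, not the full model: with one rating b, minimizing the squared error plus lambda times (b - prior) squared has a closed-form answer. The lambda value and the observations below are hypothetical.

```python
def shrunk_rating(observations, prior, lam):
    """Minimize sum((y - b)**2) + lam * (b - prior)**2 over a single rating b."""
    n = len(observations)
    return (sum(observations) + lam * prior) / (n + lam)


lam = 2000.0   # hypothetical penalty strength
prior = 0.0    # shrink toward league average

# Both players "look like" a +6 in the raw data; only the sample size differs.
bench_guy = shrunk_rating([6.0] * 200, prior, lam)
starter = shrunk_rating([6.0] * 20000, prior, lam)
print(round(bench_guy, 2), round(starter, 2))  # 0.55 5.45
```

The small-sample player barely moves off the prior, while the heavy-minutes player ends up close to what the data says.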

Single season advanced +/- models are interesting, but they're prone to fluky results and collinearity issues, even with RAPM. This is where prior-informed models come in: instead of assuming all the players have the same value, use a starting point based on the results of the previous season. This "daisy-chain" style produces some of the most reliable and usable +/- stats, especially after you get three or four seasons in a chain. There's no official name for this yet in basketball circles, no easy nomenclature, so I'm using something similar to NPI (non-prior informed): RPI, which means it's a "pure" RAPM model informed only by a previous RAPM model. I've put the results in the same spreadsheet but on another tab for a quick comparison.
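Here's a rough Python sketch of the daisy-chain idea, under a big simplifying assumption: each player is shrunk on his own with the one-variable closed form, rather than solving the full regression jointly. The function name, the lambda, and the season data are all hypothetical.

```python
def chain_seasons(seasons, lam, start_prior=0.0):
    """seasons: list of {player: (raw_rating, possessions)}, in chronological order."""
    carried = {}   # most recent rating for each player seen so far
    history = []
    for season in seasons:
        results = {}
        for player, (raw, poss) in season.items():
            # Last season's result is this season's prior; new players get start_prior.
            prior = carried.get(player, start_prior)
            results[player] = (poss * raw + lam * prior) / (poss + lam)
        carried.update(results)
        history.append(results)
    return history


# A hypothetical player whose raw data says +4 three years in a row.
seasons = [{"Vet": (4.0, 5000)}, {"Vet": (4.0, 5000)}, {"Vet": (4.0, 5000)}]
history = chain_seasons(seasons, lam=5000.0)
print([h["Vet"] for h in history])  # [2.0, 3.0, 3.5]
```

Each link in the chain lets the estimate climb closer to the +4 the data keeps showing, which is one way to see why a few seasons of chaining helps.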

The names at the top are all great-to-elite players, including MVP winners, with a few surprising results to keep things interesting. One of the purposes of +/- is to identify players who have a positive but hidden impact on the game, like Shane Battier. This is like a retroactive spotlight on the underrated games of the late 90's. Mookie Blaylock is rated as the third best player thanks to fantastic defense for a point guard and plenty of offensive value, even though his TS% was 46 (he had relied on the shortened three-point line, which ended in '97). There are a lot of debates about Nash versus Stockton, and here's fuel for the fire, as Stockton rates very well in advanced +/-. Divac wasn't invited to many all-star games, but it appears his impact is worthy of a selection. Robert Horry, the man with seven titles, also looks like more than a role player here.

 *When you reference the spreadsheet, try to include the version number. This will reduce future discrepancies.

One new tweak was attempted for this model: using a different lambda for rookies. The common technique is to give all rookies the same prior, but in a prior-informed model this can be harsh for rookies. Obviously, a player with a stat from the previous season shouldn't be treated the same as a rookie. To deal with this problem, I used the same negative prior for rookies but cut the lambda in half (this is done with the penalty factor in R). Consequently, Tim Duncan's great rookie season has been given freedom to shine. He's rated 22nd overall, just a hair above David Robinson, which agrees with many subjective opinions on the value of the two big men. Only four other rookies were significantly above average: Brevin Knight at +3.6, who somehow didn't make the all-rookie team; Ron Mercer at +2.0; Keith Van Horn at +1.1; and Zydrunas Ilgauskas at +0.7.
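For the curious, here's a small sketch of what a per-player penalty factor does. It's plain Python rather than the R code used for these results (presumably something like glmnet's penalty.factor argument, though that's a guess), and the toy lineups, priors, and lambda are invented. It solves the ridge normal equations (X'X + lambda*F) b = X'y + lambda*F*prior, where F holds each player's penalty factor.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ridge_with_priors(X, y, priors, lam, penalty_factors):
    """Ridge regression shrunk toward priors, with a per-player penalty multiplier."""
    p = len(priors)
    A = [[float(sum(row[i] * row[j] for row in X)) for j in range(p)] for i in range(p)]
    b = [float(sum(row[i] * yi for row, yi in zip(X, y))) for i in range(p)]
    for j in range(p):
        A[j][j] += lam * penalty_factors[j]
        b[j] += lam * penalty_factors[j] * priors[j]
    return solve(A, b)


# Toy data: column 0 is a veteran, column 1 is a rookie (1 = on the court).
X = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]
y = [2.0, 2.0, 6.0, 6.0, 8.0]
priors = [1.0, -2.0]   # hypothetical: veteran's last-season rating, negative rookie prior
lam = 4.0              # hypothetical penalty strength

full = ridge_with_priors(X, y, priors, lam, [1.0, 1.0])
half = ridge_with_priors(X, y, priors, lam, [1.0, 0.5])   # rookie lambda cut in half
print(round(full[1], 2), round(half[1], 2))  # 1.42 2.82
```

With the same negative prior but half the penalty, the rookie's rating in this toy example is free to move further toward what his on-court results suggest.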

The top offensive player was Karl Malone, with Barkley and Shaq close behind. Tim Hardaway and Jordan round out the top five. This shows most of Shaq's impact was on offense even when he was younger. Barkley's an interesting result since this was past his prime; perhaps he really was an offensive savant. Sweet shooters Reggie Miller and Hornacek were a short distance away from Jordan on offense, suggesting that outside shooting was quite valuable even in the 90's. As for defense, Mutombo blows everyone away with a +5.7 rating, which backs up his Defensive Player of the Year trophy. Mourning, McKie, Ewing, and Tyrone Hill fill out the top five. David Robinson and Olajuwon weren't far behind, but this was past their respective primes. McKie's probably one of the most underrated role players of his era, and Hill is probably forgotten, but he was part of those tough defensive Iverson/76ers teams.

As a last note, ten out of the top 23 players were on an all-NBA team. The lowest rated all-NBA player was Vin Baker at +0.2 and, perhaps not surprisingly, the lowest rated all-star was Antoine Walker at -2.5.

Click here for the link to the spreadsheet.


  1. Nice article. John Stockton comes out really well. Do you have any plans on doing RAPM for the 99 or 00 seasons?

    1. Yes I'll start those pretty soon. It just takes a while to fix the play by play data since there are no first names.