Monday, May 27, 2013

Average Draft Measurements for NBA Players

As the 2013 draft approaches, we'll see more and more commentary about why certain players will or won't make it in the league. One common standard for judging players is physical measurements. If a power forward is measured at 6' 7" or even 6' 8" without shoes, for example, he is said to be too short for the position, but is that actually true?

With years of measurements from the NBA draft combine, there's now a plethora of information on NBA players beyond the typical team-listed heights, which are often inflated. Measurements included here are height without shoes, height with shoes, wingspan, and standing reach, as well as the height listed on basketball-reference. The average measurement for a position is weighted by minutes played -- if there's a 6' 10" player with 1000 minutes and a 6' 8" player with 500, the average is 6' 9.3" -- and the position used is from basketball-reference. That system for position is far from perfect, but no system is, especially since positions have become so fluid.
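As a quick check on the minutes weighting, here is a minimal Python sketch of the example above. The function names are my own for illustration and are not from the original spreadsheet work.

```python
def weighted_average_height(players):
    """Minutes-weighted mean height in inches.

    `players` is a list of (height_in_inches, minutes_played) pairs.
    """
    total_minutes = sum(minutes for _, minutes in players)
    if total_minutes == 0:
        raise ValueError("no minutes played at this position")
    return sum(height * minutes for height, minutes in players) / total_minutes


def format_feet_inches(inches):
    """Render a height in inches as feet and inches with one decimal."""
    feet, remainder = divmod(inches, 12)
    return f"{int(feet)}' {remainder:.1f}\""


# The example from the text: 6' 10" (82 in) for 1000 minutes, 6' 8" (80 in) for 500.
example = [(82.0, 1000), (80.0, 500)]
print(format_feet_inches(weighted_average_height(example)))  # 6' 9.3"
```

The player with twice the minutes pulls the average two-thirds of the way toward his own height, which is where the 6' 9.3" comes from.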

Results

The results are shown in the tables below. Shifted measurements have minutes at each position balanced so they all have the same total amount (i.e., if centers have fewer minutes, some power forward minutes are moved there). OLS regression was used to fill in the missing data, where each measurement has two or three equations depending on the available information (to predict standing reach, wingspan is valuable information, but if that's not available, the next equation is used instead).

Average measurements
| Position | Height without shoes | Height with shoes | B-ref listed height | Wingspan | Standing reach |
| --- | --- | --- | --- | --- | --- |
| PG | 6’ 1.2” | 6’ 2.3” | 6’ 1.9” | 6’ 5.3” | 8’ 1.6” |
| SG | 6’ 3.8” | 6’ 5.0” | 6’ 4.7” | 6’ 7.9” | 8’ 4.8” |
| SF | 6’ 6.6” | 6’ 7.9” | 6’ 7.6” | 6’ 11.0” | 8’ 8.8” |
| PF | 6’ 7.7” | 6’ 8.9” | 6’ 9.2” | 7’ 1.1” | 8’ 11.1” |
| C | 6’ 9.8” | 6’ 11.1” | 6’ 10.8” | 7’ 3.3” | 9’ 1.7” |

Position shifted average measurements
| Position | Height without shoes | Height with shoes | B-ref listed height | Wingspan | Standing reach |
| --- | --- | --- | --- | --- | --- |
| PG | 6’ 1.3” | 6’ 2.4” | 6’ 2.0” | 6’ 5.4” | 8’ 1.7” |
| SG | 6’ 3.8” | 6’ 5.0” | 6’ 4.7” | 6’ 7.9” | 8’ 4.8” |
| SF | 6’ 6.5” | 6’ 7.8” | 6’ 7.5” | 6’ 10.8” | 8’ 8.6” |
| PF | 6’ 7.6” | 6’ 8.8” | 6’ 9.0” | 7’ 1.0” | 8’ 10.9” |
| C | 6’ 9.6” | 6’ 10.9” | 6’ 10.7” | 7’ 3.1” | 9’ 1.4” |

Not every player has a draft combine measurement, and for those players regression equations were created to fill in the gaps. The results are only slightly different, but it is important to check whether the results change at all, and the equations can be used to estimate measurements of other players. One random find was that players close to seven feet tall are no more likely to inflate their height than other players. I thought there'd be a whole-number bias, but I suppose not.

Average measurements, missing data regressed
| Position | Height without shoes | Wingspan | Standing reach |
| --- | --- | --- | --- |
| PG | 6’ 1.1” | 6’ 5.3” | 8’ 1.2” |
| SG | 6’ 3.8” | 6’ 8.1” | 8’ 4.8” |
| SF | 6’ 6.5” | 6’ 11.0” | 8’ 8.7” |
| PF | 6’ 7.9” | 7’ 1.3” | 8’ 11.2” |
| C | 6’ 9.9” | 7’ 3.2” | 9’ 1.7” |

Position shifted average measurements, missing data regressed
| Position | Height without shoes | Wingspan | Standing reach |
| --- | --- | --- | --- |
| PG | 6’ 1.2” | 6’ 5.4” | 8’ 1.3” |
| SG | 6’ 3.8” | 6’ 8.1” | 8’ 4.8” |
| SF | 6’ 6.4” | 6’ 10.9” | 8’ 8.5” |
| PF | 6’ 7.8” | 7’ 1.1” | 8’ 11.0” |
| C | 6’ 9.7” | 7’ 3.0” | 9’ 1.4” |

Draft implications

When evaluating a player's size, comparing him with the average measurements at his position makes for a fairer game. Too often guys are derided for not being true centers, for instance, because they're not seven feet tall, but that's the norm. Nerlens Noel is the top prospect in the draft, and some fear he's too small for a center. He was 6' 10" without shoes, almost perfectly average, with a slightly above average wingspan of 7' 3.75" and a standing reach of 9' 2". Although his weight is a concern in post defense, his measurements are nearly the league average for a center.

In the case of Trey Burke, measurements mean the difference between being taken second and sixth. Without shoes, he was a paltry 5' 11.75", sliding under the Mendoza line for NBA player height, but looking at his effective size -- you don't play basketball with the top of your head; you play it with your hands -- he's actually fine. Both his wingspan and standing reach are about average at 6' 5.5" and 8' 1.5", respectively.

As for undersized players, Anthony Bennett is a shade under the averages for a power forward, from the measurements we have, and Cody Zeller's T-rex arms are a killer. His height was above average, 6' 10.75" without shoes and over 7' with them, but his wingspan was the same 6' 10.75", slightly below average even for a small forward; having a wingspan no longer than your height is actually rare for an NBA player. Consequently, his standing reach of 8' 10" is below average even for a power forward. Speaking of short arms, Kelly Olynyk actually has a smaller wingspan than height without shoes -- 6' 9.75" to 6' 10.75" -- but at least his standing reach is 9'.

For some better news, Otto Porter is a big wing. His size is more similar to a power forward's -- a wingspan of 7' 1.5" and a standing reach of 8' 9.5", along with a height without shoes of 6' 7.5". If a team wants to go with a smallball lineup, it can definitely get away with Porter as the "four" there. Steven Adams appears to have good size for a center, as his standing reach is almost exactly average at 9' 1.5", and his wingspan is 7' 4.5". But it's Rudy Gobert who steals the headlines. He has a condor-like wingspan of 7' 8.5", the largest in recorded draft combine history, and a standing reach of 9' 7", bested only by the Russian giant Pavel Podkolzine. He's also the only guy in the 2013 draft measured at over seven feet without his shoes, which is actually rare. Athletically, he's not up to par, but strangely enough his speed and jumping measurements were very similar to those of Brendan Haywood, who had a long career, and perhaps Gobert can be the next Shawn Bradley -- and not the next Pavel Podkolzine.

As a last note, NBA projections typically use a player's listed height and ignore wingspan and standing reach because there's more data available that way. Looking at the average measurements, perhaps one can adjust a player's height based on his wingspan and standing reach to get an effective height. For example, Cody Zeller's measurements are somewhere in between those of a SF and a PF, and the average listed height between those two positions is about 6' 8.3", depending on whether or not you use shifted measurements. Using 6' 8" instead of his listed height of 7' could dramatically change his projected impact.
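One rough way to read that 6' 8.3" figure is as the midpoint of the SF and PF listed heights from the tables above. The sketch below assumes a plain midpoint, which is my simplification; a real effective-height adjustment would presumably weight by how close the player's wingspan and reach sit to each position.

```python
def to_inches(feet, inches):
    """Convert a feet-and-inches height to inches."""
    return feet * 12 + inches


def midpoint_height(height_a, height_b):
    """Plain midpoint of two heights in inches (an assumed simplification)."""
    return (height_a + height_b) / 2


# B-ref listed heights from the tables above.
listed = {"SF": to_inches(6, 7.6), "PF": to_inches(6, 9.2)}
listed_shifted = {"SF": to_inches(6, 7.5), "PF": to_inches(6, 9.0)}

unshifted_mid = midpoint_height(listed["SF"], listed["PF"])
shifted_mid = midpoint_height(listed_shifted["SF"], listed_shifted["PF"])

# Inches above six feet for each version of the table.
print(round(unshifted_mid - 72, 2), round(shifted_mid - 72, 2))  # 8.4 8.25
```

Either way the answer lands within a fraction of an inch of 6' 8.3", well short of a listed 7'.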

Monday, May 20, 2013

How Wins Produced Fails with Rondo

I had always wanted to look at the basic on/off statistics with Rondo and the Celtics because I felt he was an overrated point guard, and the 2013 season provided an interesting data set: he went down halfway through the season and the Celtics improved, eventually building up to the point where they secured a decent lower seed and gave the Knicks all the trouble they could handle. Rondo is a good case study in misleading box score stats. Even though he regularly leads the league in assists and sometimes in steals, he does not have a sizable positive offensive impact on the Celtics -- at times he becomes obsessed with piling up assists instead of finding the ideal offensive options; the Celtics have been a disappointing offensive team for years, getting worse as Rondo's responsibilities grew even with Hall of Famers on his team, ranking 24th this season and 27th and 17th the two before; and his horrendous outside shooting leads to teams basically refusing to guard him, sagging off five feet and using his man as a roaming help defender. He's a divisive player, with some loving his tenacity and high assist totals enough to call him a top player and a top-two point guard, while others prefer a string of different point guards before Rondo and find his offense troubling. Rajon Rondo: unselfish star with a unique game, or surly guard who selfishly pads assists?

This is also another article in a series detailing how Wins Produced fails. Previous editions include Carlos Boozer and Jose Calderon, as well as one breaking down how the metric works (and fails). Rondo is an odd case for Wins Produced at first glance, but not if you know its shortcomings well. Yes, he's an inefficient shooter, but he shoots so rarely that it's not a significant deficit. For his position, he's a great rebounder, which Wins Produced loves, and he picks up steals and, of course, plenty of assists. Even just picking up rebounds disproportionately well for your position translates into a high Wins Produced score; just ask Landry Fields, who as a rookie in 2011 apparently was at the same level as Iguodala, Pierce, and Ginobili.

For his career, Rondo has an outstanding Wins Produced score, topping out at around 15 wins in both 2009 and 2010 and ranking sixth in wins per minute in 2009. He is a bona fide star by this metric, but upon closer examination that is far from the truth.

Methodology

The basis of this small study is to look at how the Celtics fare when Rondo plays compared to how they fare without him. I wanted a simple, transparent method that anyone can follow -- no strange flavors of regression models here. Basketball articles regularly cite team records with and without a player, but those typically run into two problems: 1) they're not adjusted for the strength of the competition, and 2) a small sample size leads to unreliable and noisy results. The two problems together lead to crazy conclusions, and it's more common than you think because away games are often chained together and sometimes lead to a buzzsaw week in the schedule, like three games in Texas and then one in Oklahoma City.

Games with Rondo are compared to games without Rondo in the same season. If you simply total up stats with and without Rondo over his career, the results will be biased toward seasons in which he's healthy, and those happen to be earlier in his career when Boston was at full strength with a younger Pierce/Garnett/Allen core. (With players who miss a consistent number of games per year like Calderon you can get away with this.) I looked at Boston's record and their adjusted point differential for a given season when Rondo played, and then at their record and point differential when he didn't. The difference is Rondo's impact, and you can total this for every season.

For translating point differential into wins, there's the popular Pythagorean formula and another, simpler formula based on SRS; both are given in the notes under the results table. With small sample sizes, just using win/loss percentage is unreliable. Think about a team that normally wins 75% of its games but plays two without a star, as an extreme example; it'd be impossible to deduce how good they really were without him. I also calculated offensive and defensive efficiency for every game, compiling the results into with/without splits and adjusting for the opponent's strength in the respective offensive or defensive efficiency.
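The two conversions are short enough to write out. This is a minimal Python sketch of the formulas listed in the notes under the results table; the function names are mine, and the originals were spreadsheet calculations.

```python
def pythagorean_win_pct(points_per_game, opp_points_per_game, exponent=14):
    """Note (1): pts^14 / (pts^14 + opp_pts^14)."""
    own = points_per_game ** exponent
    opp = opp_points_per_game ** exponent
    return own / (own + opp)


def srs_win_pct(srs):
    """Note (2): (SRS * 2.7 + 41) / 82."""
    return (srs * 2.7 + 41) / 82


# 2008 with Rondo, from the results table: SRS of 9.12 over 77 games.
win_pct = srs_win_pct(9.12)
print(round(100 * win_pct, 1), round(77 * win_pct, 1))  # 80.0 61.6
```

Multiplying the win percentage by the number of games in the split gives the projected wins column, so a two-game sample can only ever project a fraction of a win either way.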

Wins Produced makes it easy to estimate how many wins a team loses without a player. TheNBAGeek provides all those stats; the hardest part was estimating the replacement production. I had two methods: 1) assume replacement level is 0.100 wins per 48 minutes, which is exactly average, and calculate the loss in wins as (Rondo's WP/48 - 0.100)/48 * games missed * minutes per game, and 2) use the same calculation but find the replacement level by taking the average of his backups, weighted by minutes. For 2), whenever Rondo missed only 1 to 5 games in a season, I just looked at the box scores of the games he didn't play to see who replaced him and for how many minutes, but ultimately that was still tricky because the Celtics often had combo guards on the bench like House. For seasons with considerable games missed, I took an average using the top lineups without Rondo (from their top 20 in general) and weighted by minutes. This is not perfect by any stretch, but it's a better estimate than 0.100, and it's much better than simply looking at with/without splits. The replacement level of 0.100 is a conservative estimate used to say: okay, this is a high guess, and if Rondo can't beat that then there's no contest, because his backups are definitely worse than 0.100 WP/48 (Boston's backup point guards have included Stephon Marbury and Nate Robinson).
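Here is a minimal Python sketch of both replacement-level methods. The 37 minutes per game and the backup numbers are made-up placeholders for illustration, not figures from TheNBAGeek; only the 0.193 WP/48 and the 43 missed games come from this article.

```python
def wins_lost(player_wp48, replacement_wp48, games_missed, minutes_per_game):
    """Wins a team is predicted to lose when a replacement takes the minutes."""
    rate_gap_per_minute = (player_wp48 - replacement_wp48) / 48
    return rate_gap_per_minute * games_missed * minutes_per_game


def minutes_weighted_wp48(backups):
    """Method 2: replacement level as a minutes-weighted average.

    `backups` is a list of (wp48, minutes) pairs.
    """
    total_minutes = sum(minutes for _, minutes in backups)
    return sum(wp48 * minutes for wp48, minutes in backups) / total_minutes


# Method 1 for 2013: Rondo at 0.193 WP/48, 43 games missed,
# and an ASSUMED 37 minutes per game.
print(round(wins_lost(0.193, 0.100, 43, 37.0), 1))  # 3.1

# Method 2 with HYPOTHETICAL backups: (WP/48, minutes played).
replacement = minutes_weighted_wp48([(0.050, 600), (0.020, 300)])
print(round(replacement, 3))  # 0.04
```

A lower replacement level widens the gap, so method 2 always predicts a bigger drop than method 1 whenever the backups are below average.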

Here's a little more detail on how I test for statistical significance. In previous editions I used a standard deviation of Wins Produced along with a t-test, but that standard deviation comes from a different player and isn't exactly applicable, so I've decided to use a different test. The hypothesis is that Wins Produced is accurate in its assessment of how many wins Rondo is worth, but there's game-to-game variation in how the team wins. This variation is modeled with the binomial distribution, since games have binary outcomes (win/loss). The BINOM.DIST function in Excel works well for this: given the win percentage Wins Produced predicts and the specific sample size of games without Rondo, I find the probability that natural variation alone would produce a record as far from the prediction as the one actually observed.
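For anyone without Excel, the same idea can be sketched in plain Python. This is an assumption-laden stand-in rather than the exact spreadsheet calculation: it uses one common two-tailed convention (sum every outcome no more likely than the observed one), so it should not be expected to reproduce the p-values reported in this article digit for digit.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k wins in n games at win probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_two_tailed_p(wins, games, expected_win_pct):
    """Two-tailed exact binomial p-value.

    Sums the probability of every outcome that is no more likely than the
    observed one. This convention is an assumption; the original Excel
    calculation may have combined the tails differently.
    """
    pmf = [binom_pmf(k, games, expected_win_pct) for k in range(games + 1)]
    cutoff = pmf[wins] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(prob for prob in pmf if prob <= cutoff))


# Illustrative inputs from this article: 45 real wins in 78 games without
# Rondo, tested against 37.6 predicted wins (replacement level 0.100).
p_value = binom_two_tailed_p(45, 78, 37.6 / 78)
print(f"p = {p_value:.4f}")
```

The inputs matter a lot here: testing the projected wins from point differential instead of the real record, or using a lower replacement level, moves the p-value considerably.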

Results

Tossing out Rondo's rookie season, when he played fewer minutes (and it can be argued that rookie seasons aren't representative of players), there are 78 games without Rondo and 397 with him, excluding the playoffs. He only missed sizable chunks in the past three seasons. Without him, their adjusted point differential (SRS) improved in two of those seasons and got worse in one, 2011. It's the same split for the other three seasons, where the only one in which they were better with him was 2009. The results are detailed below.


| Season | With Rondo: games played | With Rondo: win/loss record (%) | With Rondo: SRS | With Rondo: Pythag. projected wins (1) | With Rondo: SRS projected wins (2) | Without Rondo: games played | Without Rondo: win/loss record (%) | Without Rondo: SRS | Without Rondo: Pythag. projected wins (1) | Without Rondo: SRS projected wins (2) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2008 | 77 | 62-15 (80.5) | 9.12 | 63.0 (81.8) | 61.6 (80.0) | 5 | 4-1 (80.0) | 12.25 | 4.3 (85.6) | 4.5 (90.3) |
| 2009 | 80 | 62-18 (77.5) | 7.73 | 60.7 (75.9) | 60.4 (75.5) | 2 | 0-2 (0) | -4.01 | 0.7 (34.5) | 0.7 (36.8) |
| 2010 | 81 | 49-32 (60.5) | 3.35 | 51.6 (63.7) | 49.4 (61.0) | 1 | 1-0 (100) | 5.17 | 0.7 (67.9) | 0.7 (67.0) |
| 2011 | 68 | 47-21 (69.1) | 5.10 | 46.8 (68.9) | 45.4 (66.8) | 14 | 9-5 (64.3) | 3.49 | 9.0 (64.2) | 8.6 (61.5) |
| 2012 | 53 | 31-22 (58.5) | 1.69 | 32.0 (60.3) | 29.5 (55.6) | 13 | 8-5 (61.5) | 4.57 | 8.8 (68.0) | 8.5 (65.0) |
| 2013 | 38 | 18-20 (47.4) | -1.98 | 17.8 (46.9) | 16.5 (43.5) | 43 | 23-20 (53.5) | 0.58 | 22.6 (52.5) | 22.3 (51.9) |
(1) Uses the formula points/gm^14/(points/gm^14 + opp points/gm^14) = win %
(2) Uses the formula (SRS*2.7 + 41)/82 = win %

Note that since most of his missed games are in 2013, that year will dominate the analysis, but the pattern of the Celtics doing fine without him holds up in other years. Even though the Celtics got worse without him in 2011, for example, it wasn't by much considering he's an all-star point guard and the backup was Nate Robinson. The offense collapsed, but the defense clamped down, which is particularly damning for a system like Wins Produced that puts all its defensive eggs into the baskets of individual blocks/steals/rebounds (Rondo's strengths, not Nate's). However, in our biggest sample, Boston's offense emerged from the depths when he got injured, and their defense was virtually unchanged -- contrary to popular belief, Boston's resurgence was not from an elite defense finding itself again. So yes, when Boston lost its all-star point guard, the guy who controlled the ball like a quarterback with 11 assists a game, their offense improved by 2.3 points per 100 possessions, even though the replacement was a guy who doesn't really pass, shoot the ball accurately, or shoot the ball often.



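| Season | Wins Produced/48 mins | With Rondo: games played | With Rondo: adjusted offensive efficiency | With Rondo: adjusted defensive efficiency | Without Rondo: games played | Without Rondo: adjusted offensive efficiency | Without Rondo: adjusted defensive efficiency |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2008 | 0.185 | 77 | 111.5 | 99.3 | 5 | 111.8 | 98.7 |
| 2009 | 0.269 | 80 | 111.7 | 102.8 | 2 | 102.1 | 106.8 |
| 2010 | 0.247 | 81 | 108.5 | 104.3 | 1 | 109.9 | 104.2 |
| 2011 | 0.233 | 68 | 108.1 | 102.1 | 14 | 103.1 | 98.9 |
| 2012 | 0.211 | 53 | 101.9 | 99.7 | 13 | 101.2 | 95.9 |
| 2013 | 0.193 | 38 | 102.3 | 103.7 | 43 | 104.6 | 103.9 |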

You can't tell directly from the numbers like this, which is why you need to actually watch the action on the court -- they redirected the offense through Pierce and balanced the action throughout the team, instead of Rondo dominating the ball and searching for assists.

But what did Wins Produced say would happen? Totaling the results, the metric predicted the team would basically fall off a sharp ledge without Rondo, who normally is worth 10 to 15 wins a year (according to Wins Produced). In the table below, Wins Pythagorean is the calculated number of wins based on team strength from points per game and opposing points per game, while Wins SRS is the same but uses adjusted point differential (i.e., SRS). The binomial test, detailed in the methodology section, uses a team's expected win percentage to test whether another sample of games is significantly different, and by that test Wins Produced was inaccurate in its judgment of Rondo's value. Even when you use the very high estimate of 0.100 WP/48 for Rondo's replacements, the p-values fall between 0.05 and 0.01, the typical rejection thresholds, meaning the result is right on the edge of significance. The p-value is the probability that, if Wins Produced's predicted win percentage were correct, natural variability alone would produce results as far from the prediction as what actually happened. Note that the real number of wins without Rondo was 45, extremely close to the estimate from SRS.


| | Wins Pythagorean, RL: .1 | Wins SRS, RL: .1 | Wins Pythagorean | Wins SRS |
| --- | --- | --- | --- | --- |
| Predicted | 37.6 | 35.1 | 33.0 | 30.5 |
| Actual | 46.0 | 45.3 | 46.0 | 45.3 |
| Difference | -8.4 | -10.2 | -13.0 | -14.8 |
| Binomial test, two-tailed | 0.0255 | 0.0126 | 0.00213 | 0.000372 |
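To see where the "Actual" SRS figure comes from, this short Python sketch totals the projected wins without Rondo across the six seasons, using the games missed and SRS values from the results table and the SRS formula from its notes. The variable names are mine.

```python
def srs_win_pct(srs):
    """Win percentage from adjusted point differential: (SRS * 2.7 + 41) / 82."""
    return (srs * 2.7 + 41) / 82


# (season, games without Rondo, SRS without Rondo), from the results table.
without_rondo = [
    (2008, 5, 12.25),
    (2009, 2, -4.01),
    (2010, 1, 5.17),
    (2011, 14, 3.49),
    (2012, 13, 4.57),
    (2013, 43, 0.58),
]

total_games = sum(games for _, games, _ in without_rondo)
projected_wins = sum(games * srs_win_pct(srs) for _, games, srs in without_rondo)

print(total_games, round(projected_wins, 1))  # 78 45.3
```

The 43 games from 2013 contribute about 22 of those 45 wins, which is why that season dominates the comparison.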

Conclusion

In the past six seasons, Rondo has missed 78 games and the Celtics won 45 of them. Wins Produced, the magic metric that uses box score stats to "explain" wins, predicted that after losing Rondo and replacing him with a bench that over the years has included Marbury and Nate Robinson, the Celtics would win only 30.5 to 37.6 of those games. The high estimate comes from assuming Rondo's replacements would be league-average players, which is very optimistic. Using his actual replacements, the gap between predicted and actual wins is statistically significant, with p-values comfortably below the usual thresholds. Wins Produced failed to predict how the Celtics would do without Rondo.

This is the problem with only using box score stats -- they certainly tell a story, but only one of many. In a Rondo-centric offense, the Celtics were a disappointing offensive team and relied on midrange jumpers. An elite playmaker should be creating good shot attempts, like open three-pointers and paint attacks. He had a below-average proportion of rim assists to total assists and of three-point assists to total assists, which is typical for him. Rondo indeed collects a lot of assists, and plenty of rebounds for a point guard, but they can be replaced, and most of the time the Celtics are fine or even better off without him. Then there's the issue of his horrendous shooting. In the modern NBA, help defense and quick rotations are everything. If a player is a complete non-threat as a shooter, then his defender is free to roam and help elsewhere on the court. Rondo's defenders give him an absurd amount of space, daring him to shoot so much that he actually had a decent percentage this year from midrange, but not at a rate to swing the results. However, Rondo is insistent on looking for assists and doesn't pressure the defense enough with his own scoring. As a result, even after losing an 11-assist all-star point guard, their offense paradoxically hums right along. You can't see that with Wins Produced.

Edited: fixed offensive and defensive efficiency numbers for 2008 and 2009. They did not disrupt the statistical tests.