Topic: Performance, Performance Appraisal
Publication: Personnel Psychology
Article: The best and the rest: Revisiting the norm of normality of individual performance
Authors: O’Boyle Jr., E., & Aguinis, H.
Reviewer: Neil Morelli
The gloves are off: O’Boyle and Aguinis have challenged a perennial assumption of the performance literature. What kind of challenge, you ask? The authors argue that individual performance does not follow a normal (Gaussian) distribution but rather a power-law (Paretian) distribution. On the surface this challenge may seem academic, but if the conclusion holds, it could have serious implications for how performance, and the methods and tools used to assess it, are conceptualized and valued.
We are all too familiar with the bell-shaped normal distribution and its implications: most performers cluster around the mean, and any extreme scores or deviations from this shape are taken to indicate bias or error. Instead, O’Boyle and Aguinis embrace extreme scores by arguing that the underlying distribution of performance more closely follows the ski jump-shaped Paretian distribution. In this distribution the tails are fatter and extend farther than in the normal distribution, so extreme events are predicted more accurately. A helpful way to think about this distribution is the 80/20 rule common in economics: 20% of performers are responsible for 80% of the results.
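To make the 80/20 rule concrete, here is a minimal Python sketch. It assumes a textbook Pareto (Type I) distribution, for which the share of total output produced by the top fraction p of performers is p^(1 - 1/alpha). It compares that with a normal distribution whose mean (100) and standard deviation (15) are purely illustrative values, not figures from the article. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def pareto_top_share(p, alpha):
    """Share of total output produced by the top fraction p under a
    Pareto (Type I) distribution with shape alpha > 1."""
    return p ** (1 - 1 / alpha)


def normal_top_share(p, mu, sigma):
    """Share of total output produced by the top fraction p under a
    normal distribution (assumes mu is far enough above 0 that
    negative scores are negligible)."""
    z = NormalDist().inv_cdf(1 - p)                   # cut-off in standard units
    top_mean = mu + sigma * NormalDist().pdf(z) / p   # mean score of the top group
    return p * top_mean / mu


# Shape parameter at which the top 20% produce exactly 80% of the output.
ALPHA_80_20 = math.log(5) / math.log(4)   # about 1.16

print(round(pareto_top_share(0.2, ALPHA_80_20), 3))   # 0.8
print(round(normal_top_share(0.2, 100, 15), 3))       # about 0.24 with these illustrative values
```

Under these assumptions, the top fifth of a normally distributed workforce produces only about a quarter of total output, versus four fifths under the Pareto shape.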
O’Boyle and Aguinis tested this claim by collecting performance outcomes from 198 samples that spanned an eclectic mix of researchers, entertainers, politicians, and athletes. They compared chi-square values between models that forced the data to fit a normal (Gaussian) distribution and a power-law (Paretian) distribution. They found that 93% of their samples fit a Paretian distribution better than a Gaussian one; in other words, most of the performance outcomes were generated by a small group of superstar performers.
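The general idea of comparing chi-square misfit between the two distributions can be sketched in a few lines of Python. This is not the authors' actual procedure: the binning scheme, the way each distribution is fitted, the synthetic data, and all function names are assumptions made for illustration only.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def chi_square(observed, expected):
    """Pearson chi-square statistic: larger values mean worse fit."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def compare_fits(data, n_bins=10):
    """Chi-square misfit of a fitted normal and a fitted Pareto.
    Assumes all scores are positive and not all identical."""
    xs = sorted(data)
    n = len(xs)

    # Interior bin edges at sample quantiles; the outer bins are open-ended.
    edges = [xs[n * k // n_bins] for k in range(1, n_bins)]
    bounds = [-math.inf] + edges + [math.inf]
    observed = [sum(lo < x <= hi for x in xs)
                for lo, hi in zip(bounds, bounds[1:])]

    # Normal model: fitted with the sample mean and standard deviation.
    normal = NormalDist(mean(xs), stdev(xs))

    # Pareto model: minimum fixed at the smallest score,
    # shape estimated by maximum likelihood.
    x_min = xs[0]
    alpha = n / sum(math.log(x / x_min) for x in xs)

    def pareto_cdf(x):
        return 1 - (x_min / x) ** alpha if x > x_min else 0.0

    def expected_counts(cdf):
        probs = [0.0] + [cdf(e) for e in edges] + [1.0]
        # Floor tiny expectations to avoid dividing by zero.
        return [max(n * (hi - lo), 1e-9) for lo, hi in zip(probs, probs[1:])]

    return {
        "normal": chi_square(observed, expected_counts(normal.cdf)),
        "pareto": chi_square(observed, expected_counts(pareto_cdf)),
    }


# Synthetic heavy-tailed "performance" scores (an assumption, not real data).
random.seed(0)
scores = [random.paretovariate(1.5) for _ in range(2000)]
print(compare_fits(scores))
```

Because the synthetic scores are drawn from a Pareto distribution in the first place, the Pareto misfit should come out far smaller than the normal misfit here; with real performance data the comparison is an empirical question.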
What does this mean for researchers? The generally accepted practice of removing outliers and defaulting to statistical tests that assume a normal distribution when studying performance outcomes may need to be rethought. And for practitioners? Utility analysis, which shows the ROI of performance measurement, could become more accurate under this new assumption. Also, measures that track performance or are intended to select high performers may need to be readjusted to account for the “superstar effect.” Overall, the authors suggest that organizations would be well served by properly identifying, managing, compensating, and leveraging their elite performers.
Definitely interesting. I’m still in the middle of reading it…as I read, I’m wondering about some kind of contextual asymptote.
Essentially: is a Pareto distribution better at explaining differences when the performance standard is exceptional?
For example, take the Grammy Award. You can be a very successful artist but not have the award (or any nominations). In fact, I would speculate that most successful artists have not won Grammy Awards. So, when we think about performance here, we are differentiating between good or very good and incredibly exceptional.
Notably, we aren’t really talking about or including average performers, better-than-average but not good performers, or poor (to incredibly, exceptionally poor) performers. So we have a limited range of performance.
Clearly, then, the Pareto is better at explaining this limited range, because within it we have a higher ratio of exceptional to non-exceptional performers (something a normal distribution wouldn’t predict as well as a Pareto would).
The same logic would then apply to the other tail. In a limited range of exceptional to non-exceptional performers the ratio of the first to the second would be higher than you would expect in a random sample.
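One way to see the range-restriction point is a small simulation. The Python sketch below draws a purely normal population (the mean of 100 and standard deviation of 15 are illustrative assumptions), keeps only the top 5%, and counts those scores in equal-width bins. The function name and the 5% cut-off are hypothetical choices. It says nothing about whether the samples in the article were actually restricted in this way.

```python
import random
from statistics import mean, median


def top_slice_histogram(sample, keep_fraction=0.05, n_bins=5):
    """Keep only the top `keep_fraction` of scores and count them in
    equal-width bins running from the cut-off to the maximum."""
    xs = sorted(sample)
    n_keep = max(2, int(len(xs) * keep_fraction))
    top = xs[-n_keep:]
    lo, hi = top[0], top[-1]
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in top:
        idx = min(int((x - lo) / width), n_bins - 1)  # the maximum goes in the last bin
        counts[idx] += 1
    return top, counts


random.seed(1)
population = [random.gauss(100, 15) for _ in range(100_000)]  # illustrative scale
top, counts = top_slice_histogram(population)

print(counts)                    # bin counts for the retained top slice
print(mean(top) > median(top))   # right skew: a few extreme scores pull the mean up
```

Within the retained slice, most scores sit just above the cut-off and only a handful reach the far end, which is the ski-jump shape even though the full population is normal. By symmetry, the bottom 5% mirrors this pattern.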
Imagine, then, that as you move across a continuum, the distribution that best describes the range of interest changes based on the general position of the range and the “distance” between its low and high cut-points. In a sense, the variance within the variance is not fractal-like but instead depends on the general position and width of the range. As you slide from left to right, you see Pareto transition into Normal and then back into Pareto: a Pareto-Normal distribution.
The overall shape of the distribution then probably depends on the relative size (n) of each of the sub-variances within the global variance and on the range(s) of interest.
So…stepping back to a performance distribution in a workplace. The Pareto is better at explaining the tails, but the Normal is better at explaining the middle. If we’re interested in exploring a high-performance or low-performance population, Pareto will be a better assumption to make. The challenge is trying to estimate the relative fatness of the tails and the size of the middle.
The danger, if you switched from an entirely “normal distribution” of performance to an entirely “Pareto distribution” of performance, is that suddenly your workforce is dangerously overloaded with poor contributors (80/20).
If this were true, the majority of the workforce could be removed with no substantive impact on performance. Our intuition challenges this hypothesis.
Even if 80/20 were reasonable, we must assume that the “20” who achieve the “80” are doing it with a significant degree of interdependence on the remaining 80% of workers. This would suggest that the majority of the remaining 80% are not poor performers.
I also wonder about the stability of the chi-square. E.g., is the difference in badness of fit from 1 to 10 the same as from 200 to 210, or from 3E+10 to 3.1E+10?
At a certain level, how meaningful is it to distinguish between very hideous fit and extremely, OMG-hideous fit?
I don’t know if this analogy works, but imagine two shirts: one is designed for a 50-foot man and doesn’t fit at all, and the other for a 100-foot dinosaur and doesn’t fit at all. At what point does the degree of non-fit matter?
Anyway…excellent food for thought.
Thank you