
What are the disadvantages of percentiles? A Closer Look at Statistical Limitations


In medicine and health research, percentiles are often used to simplify complex data, such as children's measurements plotted on growth charts. However, relying solely on these relative rankings can obscure critical details. Understanding the disadvantages of percentiles is crucial for accurate interpretation and decision-making.

Quick Summary

Percentiles can hide raw score differences, oversimplify data distribution, and become unreliable with skewed or small datasets. Their interpretation depends heavily on the specific norm group, and because their units are unequal they can exaggerate small changes near the middle of a data range while masking large changes at the extremes.

Key Points

  • Loss of Raw Data: Percentiles provide rank but discard the actual scores, making it impossible to know the magnitude of difference between data points.

  • Unequal Intervals: The raw-score distance between percentile ranks is not consistent. In a normal distribution, ranks are packed closely together around the median and spread far apart at the extremes, leading to misinterpretations.

  • Norm Group Dependence: The accuracy of percentiles is entirely dependent on the representativeness and size of the reference population used for comparison.

  • Exaggerated Change Near the Average: For individuals in the middle of a normal distribution, small changes in a raw score can produce large percentile shifts, potentially overstating the significance of a change.

  • Compressed Extremes: At the extreme ends of the distribution, even large raw score changes may produce only minimal percentile shifts, masking real progress or decline.

  • Risk of Misinterpretation: The unequal spacing of percentiles can mislead individuals into believing all percentile jumps represent the same amount of change.


The Hidden Truth Behind Percentile Rankings

While percentiles offer a simple way to express a value's relative position within a dataset, their apparent simplicity belies significant drawbacks. In fields like general health, where accurate data interpretation can directly impact patient care and public health policy, overlooking these limitations can have serious consequences. This deep dive explores the nuanced and often-overlooked disadvantages that challenge the conventional use of percentiles.

Loss of Information and Detail

One of the most fundamental disadvantages of percentiles is the inherent loss of specific data. By boiling down a person's score to a single rank, the actual raw value is discarded. A patient's weight at the 75th percentile, for instance, provides no information on whether they are 10 pounds or 50 pounds heavier than someone at the 50th percentile. This lack of quantitative context makes it impossible to gauge the magnitude of the difference between data points, especially at the ends of the distribution. In a clinical setting, this lost detail can prevent a complete understanding of a patient's condition or progress over time. For instance, two individuals at the 90th percentile for blood pressure may have vastly different raw readings, a nuance lost when only the percentile rank is considered.
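As a rough illustration of how much magnitude a percentile rank hides, the sketch below compares two hypothetical reference populations that are assumed to be normally distributed. The means and standard deviations are made-up values chosen only to mirror the weight example above, and `gap_between_percentiles` is an illustrative helper, not part of any clinical tool.

```python
from statistics import NormalDist

# Hypothetical reference populations (weights in pounds): same mean,
# different spread. These numbers are illustrative, not real norms.
narrow = NormalDist(mu=150, sigma=15)
wide = NormalDist(mu=150, sigma=60)


def gap_between_percentiles(dist, low, high):
    """Raw-score difference between two percentiles of a distribution."""
    return dist.inv_cdf(high / 100) - dist.inv_cdf(low / 100)


# The same ranks (50th vs. 75th) correspond to very different raw gaps.
narrow_gap = gap_between_percentiles(narrow, 50, 75)
wide_gap = gap_between_percentiles(wide, 50, 75)

print(f"narrow population: {narrow_gap:.1f} lb")  # about 10.1 lb
print(f"wide population:   {wide_gap:.1f} lb")    # about 40.5 lb
```

Under these assumptions, "75th percentile" means roughly 10 pounds above the median in one population and roughly 40 pounds in the other, which is the information the rank alone cannot convey.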

Unequal Units and Misinterpretation

Percentiles do not represent equal intervals across the entire range of data. The distances between percentile ranks are distorted, particularly at the extremes of a normal distribution. A jump from the 50th to the 60th percentile, for example, represents a smaller change in raw score than a jump from the 90th to the 99th percentile. This can lead to serious misinterpretations by both patients and clinicians. In fields like neuropsychology, studies have shown that laypersons, and even some educated professionals, can misinterpret percentiles as having equal units, leading to skewed perceptions of performance. This can result in overestimating the significance of changes around the mean and underestimating changes in the extreme tails.
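The size of this distortion can be checked numerically. The following minimal sketch assumes a standard normal distribution (mean 0, standard deviation 1) and converts percentiles back to raw scores in standard-deviation units; `z_at` is a hypothetical helper name.

```python
from statistics import NormalDist

std_normal = NormalDist()  # assumed distribution: mean 0, SD 1


def z_at(percentile):
    """Raw score (in SD units) at a given percentile."""
    return std_normal.inv_cdf(percentile / 100)


mid_gap = z_at(60) - z_at(50)   # a 10-point jump near the median
tail_gap = z_at(99) - z_at(90)  # a 9-point jump in the upper tail

print(f"50th -> 60th: {mid_gap:.2f} SD")   # about 0.25 SD
print(f"90th -> 99th: {tail_gap:.2f} SD")  # about 1.04 SD
```

In this idealized case, the 9-point jump in the tail covers roughly four times the raw-score distance of the 10-point jump near the median. Real data that are skewed or otherwise non-normal will give different figures.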

Sensitivity to Sample Composition and Size

Percentile calculations are heavily dependent on the composition and size of the reference or 'norm' group. If the reference group is not representative of the target population, the percentile ranks can be misleading. For example, a percentile for a child's weight may be based on data from a decade ago or a different population, making current comparisons inaccurate.

Furthermore, percentiles computed from small datasets are unstable. In a small sample, each observation accounts for a large share of the ranking, so a single new data point can noticeably shift the percentile ranks of the others, making year-over-year comparisons unreliable.

Example of Sample Bias

  1. Outdated Norms: Using a growth chart from the 1990s might not accurately reflect modern nutritional standards and population genetics.
  2. Unrepresentative Population: Applying a percentile from a study on a specific ethnic group to a general, diverse population can introduce significant bias.
  3. Small Sample Size: In a study with only 30 participants, each person accounts for more than 3 percentile points, so adding or removing a single observation can move the other participants' ranks by about that much.
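The small-sample effect can be sketched with a toy dataset. The scores below are invented, and `percentile_rank` uses one common definition (percent of reference values at or below the score); other definitions exist and give slightly different numbers.

```python
def percentile_rank(value, reference):
    """Percent of reference values at or below `value`."""
    at_or_below = sum(1 for r in reference if r <= value)
    return 100 * at_or_below / len(reference)


# Hypothetical sample of 10 scores; we track the rank of the score 63.
small_sample = [48, 52, 55, 57, 60, 63, 66, 70, 74, 80]
score = 63

before = percentile_rank(score, small_sample)
after_low = percentile_rank(score, small_sample + [40])       # one new low value
after_high = percentile_rank(score, small_sample + [81])      # one new high value
after_outlier = percentile_rank(score, small_sample + [400])  # one extreme value

print(f"before:        {before:.1f}")         # 60.0
print(f"+ low value:   {after_low:.1f}")      # 63.6
print(f"+ high value:  {after_high:.1f}")     # 54.5
print(f"+ outlier 400: {after_outlier:.1f}")  # 54.5
```

Two things show up in this sketch: one added observation moves the rank by several points in a sample this small, and the size of the outlier is irrelevant (81 and 400 have the same effect), because only its position relative to the score matters.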

Uneven Sensitivity to Change Across the Distribution

For data that follows a normal or bell-shaped distribution, most scores cluster around the median (50th percentile). Within this dense cluster, even a small change in a raw score moves a person past many others, producing a large percentile shift. In contrast, at the extreme ends of the distribution, where scores are sparse, a large change in the raw score is required to move from one percentile to the next. This uneven sensitivity can exaggerate trivial fluctuations in health metrics for the majority of people near the average while masking important progress or decline for those at the tails.
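To see how the same raw-score change translates into percentile shifts at different starting points, one can run a quick check. This is a minimal sketch that assumes a standard normal distribution; the 0.25 SD change and the starting positions are arbitrary illustrative choices.

```python
from statistics import NormalDist

std_normal = NormalDist()  # assumed distribution: mean 0, SD 1


def percentile_of(z):
    """Percentile corresponding to a raw score given in SD units."""
    return 100 * std_normal.cdf(z)


shift = 0.25  # identical raw-score change, in SD units

near_median = percentile_of(0.0 + shift) - percentile_of(0.0)
in_tail = percentile_of(2.0 + shift) - percentile_of(2.0)

print(f"starting at the median:    +{near_median:.1f} percentile points")  # about +9.9
print(f"starting 2 SD above mean:  +{in_tail:.1f} percentile points")      # about +1.1
```

Under this assumption, an identical raw change is worth roughly ten percentile points at the median but only about one point two standard deviations above the mean.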

Comparison Table: Percentiles vs. Standard Scores (Z-Scores)

Feature | Percentiles | Standard Scores (Z-Scores)
Measurement Scale | Ordinal (rank) | Interval (equal units)
Sensitivity to Change | High around median, low at extremes | Consistent across the entire distribution
Information Included | Relative position | Raw score's distance from the mean, in standard deviations
Outlier Impact | Not directly affected by magnitude | Sensitive to magnitude of outliers
Interpretability | Often misinterpreted due to unequal units | Requires some statistical understanding to interpret
Best For | Communicating relative standing to non-experts | Statistical analysis, measuring growth over time

Conclusion: The Need for Context

Ultimately, the disadvantages of percentiles do not mean they are useless, but rather that they must be used with caution and contextual awareness. While intuitive for communicating relative standing, they are poor tools for measuring precise change over time or comparing scores across different populations. For deeper statistical analysis and accurate tracking of growth or change, standard scores and raw data should be the preferred metrics. Clinicians and researchers must educate themselves and the public on the inherent flaws of percentiles to prevent misleading interpretations. In health, this is not a theoretical exercise; it is a practical necessity for ensuring the best possible patient outcomes.

For a more technical discussion on the psychometric properties of different scoring metrics, the National Institutes of Health provides robust resources through its archives on scientific articles, such as those found on the National Library of Medicine website [PMC9796399].

Frequently Asked Questions

Because percentiles only show relative rank, they are poor indicators of an individual's absolute progress. A person could make significant gains in a raw health metric but see little or no change in their percentile rank if the rest of the group also improved.

No, percentiles do not treat all data points equally. They are highly sensitive to small raw-score changes in the middle of a distribution where data is clustered, but relatively insensitive to even large changes at the extreme ends.

Comparing percentile ranks between different groups can be misleading because each group may have been measured against a different 'norm' population. To make valid comparisons, the same reference group must be used.

The main challenge is that percentiles represent an ordinal scale, not an equal-interval scale. The raw-score difference between the 50th and 60th percentile is not the same as the difference between the 90th and 99th percentile, which can lead to misinterpretation of the magnitude of change.

It is generally inappropriate to use percentiles when the actual magnitude of the score is important, when comparing scores across different measurement periods with different reference groups, or when working with very small datasets.

Percentiles are robust against the magnitude of outliers since they are based on rank. However, a single outlier in a small dataset can significantly shift the percentile ranks of all other data points.

Percentile ranks from different sources can be difficult to compare because the reference populations used to establish the norms are often different, invalidating direct comparisons.


Medical Disclaimer

This content is for informational purposes only and should not replace professional medical advice.