Measures of Dispersion

Measures of Dispersion: Introduction & Range - The Spread Spectrum

  • Measures of Dispersion: Quantify the spread or variability of data around a measure of central tendency. Indicate how homogeneous or heterogeneous the data are.
  • Range: Simplest and crudest measure.
    • Definition: Difference between the highest ($H$) and lowest ($L$) observed values.
    • Formula: $Range = H - L$
    • Merits: Easy to calculate and understand.
    • Demerits: Highly affected by extreme values (outliers); ignores data distribution between extremes.

⭐ Range is highly susceptible to sampling fluctuations and is not based on all observations.
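
To make the outlier sensitivity concrete, here is a minimal Python sketch of $Range = H - L$. The readings are made-up illustrative values and `value_range` is a hypothetical helper name, not something from the lesson.

```python
def value_range(values):
    """Return H - L: the highest observed value minus the lowest."""
    return max(values) - min(values)


# Hypothetical systolic blood pressure readings (mmHg), invented for illustration.
readings = [110, 118, 120, 122, 126]
with_outlier = readings + [190]  # the same data plus one extreme value

range_plain = value_range(readings)         # 126 - 110
range_outlier = value_range(with_outlier)   # 190 - 110

print(range_plain, range_outlier)
```

A single extreme reading changes the range from 16 to 80, even though the other five observations are untouched.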

Measures of Dispersion: Quartiles, IQR & QD - Middle Ground Masters

  • Quartiles: Values dividing ranked data into four equal parts.
    • $Q_1$ (Lower Quartile): 25th percentile.
    • $Q_2$ (Median): 50th percentile.
    • $Q_3$ (Upper Quartile): 75th percentile.
  • Interquartile Range (IQR): Difference between $Q_3$ and $Q_1$; measures spread of middle 50% of data.
    • Formula: $IQR = Q_3 - Q_1$.
    • Robust to outliers.
  • Quartile Deviation (QD): Half the IQR; also called semi-interquartile range.
    • Formula: $QD = (Q_3 - Q_1)/2$.
  • Quartiles and the IQR are the basis of Box & Whisker plots, which visually represent the data distribution and the presence of outliers.

[Figure: Boxplot illustrating Q1, Q2, Q3, IQR, and outliers]

⭐ IQR is a robust measure of spread, preferred over range for skewed data or data with outliers, as it is not affected by extreme values in the dataset's tails.
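
The quartile formulas above can be sketched with Python's standard `statistics` module. This is an illustration under assumptions: the data are invented, and `statistics.quantiles` uses its default "exclusive" interpolation method, which is only one of several quartile conventions (textbooks and other software may give slightly different $Q_1$ and $Q_3$ on small samples).

```python
import statistics


def quartile_summary(values):
    """Return (Q1, Q2, Q3, IQR, QD) using statistics.quantiles defaults."""
    q1, q2, q3 = statistics.quantiles(values, n=4)  # three cut points
    iqr = q3 - q1          # IQR = Q3 - Q1
    qd = iqr / 2           # QD = (Q3 - Q1) / 2
    return q1, q2, q3, iqr, qd


# Hypothetical data, invented for illustration.
clean = [2, 4, 6, 8, 10, 12, 14]
skewed = [2, 4, 6, 8, 10, 12, 100]  # largest value replaced by an outlier

clean_summary = quartile_summary(clean)
skewed_summary = quartile_summary(skewed)

print("clean :", clean_summary, "range =", max(clean) - min(clean))
print("skewed:", skewed_summary, "range =", max(skewed) - min(skewed))
```

In this example the outlier stretches the range from 12 to 98, while $Q_1$, $Q_3$, the IQR, and the QD are identical for both datasets.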

Measures of Dispersion: Mean Deviation - Average Absolutes

  • Definition: The average of absolute deviations of observations from a central tendency measure (mean, median, or mode).
  • Formula (from arithmetic mean $\bar{X}$): $MD = \frac{\sum |X_i - \bar{X}|}{N}$
    • $|X_i - \bar{X}|$: Absolute deviation of an observation $X_i$.
  • Characteristics:
    • Based on all observations.
    • Absolute values used, ignoring signs of deviations.
    • Less affected by extreme values compared to Standard Deviation.
    • Simpler to understand and compute than SD.

⭐ Mean Deviation is least when deviations are taken about the median.
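
A short Python sketch of the mean deviation formula, computed about both the mean and the median so the two can be compared. The dataset and the helper name `mean_deviation` are assumptions made for illustration.

```python
import statistics


def mean_deviation(values, center):
    """Average of the absolute deviations |X_i - center| over all N values."""
    return sum(abs(x - center) for x in values) / len(values)


# Hypothetical right-skewed data, invented for illustration.
data = [1, 2, 3, 4, 10]

mean = statistics.fmean(data)      # arithmetic mean
median = statistics.median(data)   # middle value of the ranked data

md_about_mean = mean_deviation(data, mean)
md_about_median = mean_deviation(data, median)

print(md_about_mean, md_about_median)
```

Here the mean is 4 and the median is 3. The mean deviation about the median (2.2) is smaller than the one about the mean (2.4), consistent with the note above.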

Measures of Dispersion: Variance & Standard Deviation - The Golden Standards

  • Variance: Average of squared differences from the Mean.
    • Population Variance ($\sigma^2$; also written Var(X) or called Mean Square Deviation, MSD): $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
    • Sample Variance ($s^2$): $s^2 = \frac{\sum (X - \bar{X})^2}{n-1}$ (uses $n-1$ for unbiased estimate)
  • Standard Deviation (SD): Positive square root of the Variance. $\sigma = \sqrt{\sigma^2}$ (population) or $s = \sqrt{s^2}$ (sample).
    • Most widely used & reliable measure; units same as data.
    • Foundation for normal distribution interpretation.
  • Normal Distribution & Empirical Rule:
    • Mean $\pm 1$ SD: covers approx. 68% of observations.
    • Mean $\pm 2$ SD: covers approx. 95% of observations.
    • Mean $\pm 3$ SD: covers approx. 99.7% of observations.
    • 📌 Mnemonic: Remember 68-95-99.7 for 1, 2, 3 SDs!
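
The variance and SD formulas, and the 68-95-99.7 figures, can be checked with Python's standard `statistics` module. This is a sketch on invented data: `pvariance`/`pstdev` divide by $N$, `variance`/`stdev` divide by $n-1$, and `share_within` is a hypothetical helper that evaluates the standard normal curve.

```python
import statistics
from statistics import NormalDist

# Hypothetical observations, invented for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Manual calculation: sum of squared deviations from the mean.
mean = sum(data) / len(data)
sum_sq = sum((x - mean) ** 2 for x in data)
manual_pop_var = sum_sq / len(data)            # divide by N
manual_sample_var = sum_sq / (len(data) - 1)   # divide by n - 1

# The same quantities from the standard library.
pop_var = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)
sample_var = statistics.variance(data)
sample_sd = statistics.stdev(data)

print(pop_var, pop_sd, sample_var, sample_sd)

# Empirical rule: share of a normal distribution within k SDs of the mean.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Probability that a normal value lies within mean +/- k SD."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
```

For this dataset the mean is 5, the population variance is 4, and the population SD is 2. The sample variance (32/7) is slightly larger because of the $n-1$ divisor.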

⭐ Coefficient of Variation (CV) = $\left(\frac{SD}{\text{Mean}}\right) \times 100\%$. It's a relative measure of dispersion, useful for comparing variability between datasets with different units or widely differing means.

Measures of Dispersion: Coefficient of Variation - Relative Ruler

  • Definition: A standardized, unitless measure of relative variability, expressed as a percentage.
  • Formula: $CV = (SD / \text{Mean}) \times 100\%$
    • SD: Standard Deviation
    • Mean: Arithmetic Mean
  • Interpretation:
    • Higher CV $\rightarrow$ $\uparrow$ relative variability.
    • Lower CV $\rightarrow$ $\downarrow$ relative variability.
  • Use: Compares dispersion of datasets with different units or significantly different means.

⭐ CV is preferred over SD for comparing variability between groups with widely different means or different units of measurement (e.g., comparing variability in height (cm) and weight (kg)).
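
The height-versus-weight comparison can be sketched in Python as follows. The measurements are invented, `coefficient_of_variation` is a hypothetical helper, and the sample SD ($n-1$ divisor) is used as an assumption, since the formula above does not say which SD to take.

```python
import statistics


def coefficient_of_variation(values):
    """CV = (SD / Mean) x 100, using the sample SD (an assumption)."""
    return statistics.stdev(values) / statistics.fmean(values) * 100


# Hypothetical measurements for the same five people, invented for illustration.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [50, 60, 70, 80, 90]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)

print(round(cv_height, 2), round(cv_weight, 2))
```

The SDs (about 7.9 cm and 15.8 kg) cannot be compared directly because the units differ. The CVs (about 4.65% and 22.59%) are unitless and show that weight is relatively more variable in this made-up sample.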

High-Yield Points - ⚡ Biggest Takeaways

  • Range: Simplest measure; highly affected by outliers.
  • Interquartile Range (IQR): Spread of middle 50% of data; robust to outliers.
  • Variance: Average of squared deviations from the mean; units are squared.
  • Standard Deviation (SD): Square root of variance; most common measure; 68-95-99.7 rule applies for normal distribution.
  • Coefficient of Variation (CV): Relative measure of dispersion (SD/Mean) × 100; compares datasets with different units.
  • Standard Error (SE): Measures precision of sample mean (SD/√n); decreases with ↑ sample size (n).
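
As a quick illustration of the last point, here is a minimal Python sketch of $SE = SD/\sqrt{n}$. The SD of 10 and the sample sizes are arbitrary illustrative numbers, and `standard_error` is a hypothetical helper name.

```python
import math


def standard_error(sd, n):
    """Standard error of the sample mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)


sd = 10  # assumed standard deviation, chosen for illustration

se_n25 = standard_error(sd, 25)    # 10 / 5
se_n100 = standard_error(sd, 100)  # 10 / 10

print(se_n25, se_n100)
```

Quadrupling the sample size from 25 to 100 halves the standard error (2.0 to 1.0), because SE shrinks with $\sqrt{n}$ rather than with $n$.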