Measures of Dispersion

Measures of Dispersion: Introduction & Range - The Spread Spectrum

  • Measures of Dispersion: Quantify the spread or variability of data around a measure of central tendency; they indicate how homogeneous or heterogeneous the data are.
  • Range: Simplest and crudest measure.
    • Definition: Difference between the highest ($H$) and lowest ($L$) observed values.
    • Formula: $Range = H - L$
    • Merits: Easy to calculate and understand.
    • Demerits: Highly affected by extreme values (outliers); ignores data distribution between extremes.

⭐ Range is highly susceptible to sampling fluctuations and is not based on all observations.
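
To make the outlier sensitivity concrete, here is a minimal Python sketch of $Range = H - L$. The function name `value_range` and the blood pressure readings are hypothetical, chosen only for illustration.

```python
def value_range(values):
    """Return H - L: the highest observed value minus the lowest."""
    return max(values) - min(values)

# Hypothetical systolic blood pressure readings (mmHg), illustration only
readings = [110, 112, 115, 118, 120]
with_outlier = readings + [190]  # one extreme value added

print(value_range(readings))      # 10
print(value_range(with_outlier))  # 80
```

A single extreme reading changes the range from 10 to 80 even though the other five observations are unchanged.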

Measures of Dispersion: Quartiles, IQR & QD - Middle Ground Masters

  • Quartiles: Values dividing ranked data into four equal parts.
    • $Q_1$ (Lower Quartile): 25th percentile.
    • $Q_2$ (Median): 50th percentile.
    • $Q_3$ (Upper Quartile): 75th percentile.
  • Interquartile Range (IQR): Difference between $Q_3$ and $Q_1$; measures spread of middle 50% of data.
    • Formula: $IQR = Q_3 - Q_1$.
    • Robust to outliers.
  • Quartile Deviation (QD): Half the IQR; also called semi-interquartile range.
    • Formula: $QD = (Q_3 - Q_1)/2$.
  • Quartiles and the IQR are the basis for constructing box-and-whisker plots, which visually represent the data distribution and the presence of outliers.

[Figure: Boxplot illustrating $Q_1$, $Q_2$, $Q_3$, IQR, and outliers]

⭐ IQR is a robust measure of spread, preferred over range for skewed data or data with outliers, as it is not affected by extreme values in the dataset's tails.
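
The quartile formulas above can be sketched with Python's standard `statistics.quantiles`. Several quartile conventions exist and can give slightly different cut points; the `method="inclusive"` choice, the helper name `quartile_summary`, and the data are assumptions made for this example.

```python
from statistics import quantiles

def quartile_summary(values):
    """Return Q1, Q2, Q3, IQR and QD for a list of numbers."""
    # Assumption: the "inclusive" convention; other conventions may
    # place the cut points slightly differently.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {"Q1": q1, "Q2": q2, "Q3": q3, "IQR": iqr, "QD": iqr / 2}

# Hypothetical data, illustration only
data = [2, 4, 6, 8, 10, 12, 14, 16, 18]
with_outlier = [2, 4, 6, 8, 10, 12, 14, 16, 180]  # largest value made extreme

print(quartile_summary(data))          # Q1 6.0, Q2 10.0, Q3 14.0, IQR 8.0, QD 4.0
print(quartile_summary(with_outlier))  # same quartiles, IQR and QD
print(max(data) - min(data), max(with_outlier) - min(with_outlier))  # 16 178
```

The extreme value inflates the range from 16 to 178, while the IQR stays at 8 because the middle 50% of the data has not moved.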

Measures of Dispersion: Mean Deviation - Average Absolutes

  • Definition: The average of absolute deviations of observations from a central tendency measure (mean, median, or mode).
  • Formula (from arithmetic mean $\bar{X}$): $MD = \frac{\sum |X_i - \bar{X}|}{N}$
    • $|X_i - \bar{X}|$: Absolute deviation of an observation $X_i$.
  • Characteristics:
    • Based on all observations.
    • Absolute values used, ignoring signs of deviations.
    • Less affected by extreme values compared to Standard Deviation.
    • Simpler to understand and compute than SD.

⭐ Mean Deviation is least when deviations are taken about the median.
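
As a small worked sketch of the formula above, the snippet below computes the mean deviation about the mean and about the median. The helper `mean_deviation` and the data values are hypothetical.

```python
from statistics import mean, median

def mean_deviation(values, center):
    """Average of absolute deviations of the values from `center`."""
    return sum(abs(x - center) for x in values) / len(values)

# Hypothetical data, illustration only
data = [2, 3, 5, 8, 22]

about_mean = mean_deviation(data, mean(data))      # center = 8
about_median = mean_deviation(data, median(data))  # center = 5

print(about_mean)    # 5.6
print(about_median)  # 5.0
```

For this dataset the deviation about the median (5.0) is smaller than the deviation about the mean (5.6), in line with the rule stated above.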

Measures of Dispersion: Variance & Standard Deviation - The Golden Standards

  • Variance: Average of squared differences from the Mean.
    • Population Variance ($\sigma^2$; also written $\text{Var}(X)$ or called the mean square deviation, MSD): $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
    • Sample Variance ($s^2$): $s^2 = \frac{\sum (X - \bar{X})^2}{n-1}$ (dividing by $n-1$ gives an unbiased estimate of the population variance)
  • Standard Deviation (SD): Positive square root of the variance: $\sigma = \sqrt{\sigma^2}$ for a population, $s = \sqrt{s^2}$ for a sample.
    • Most widely used & reliable measure; units same as data.
    • Foundation for normal distribution interpretation.
  • Normal Distribution & Empirical Rule:
    • Mean $\pm 1$ SD: covers approx. 68% of observations.
    • Mean $\pm 2$ SD: covers approx. 95% of observations.
    • Mean $\pm 3$ SD: covers approx. 99.7% of observations.
    • 📌 Mnemonic: Remember 68-95-99.7 for 1, 2, 3 SDs!
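
The population and sample formulas, and the 68-95-99.7 figures, can be checked with Python's standard `statistics` module. This is a sketch under stated assumptions: the data values and the helper `coverage_within` are hypothetical.

```python
from statistics import NormalDist, pstdev, pvariance, stdev, variance

# Hypothetical data, illustration only (mean = 5, sum of squared deviations = 32)
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(pvariance(data))  # population variance: 32 / 8 = 4
print(pstdev(data))     # population SD: 2.0
print(variance(data))   # sample variance: 32 / 7, about 4.57
print(stdev(data))      # sample SD, about 2.14

def coverage_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"Mean ± {k} SD: {coverage_within(k):.1%}")  # 68.3%, 95.4%, 99.7%
```

The sample variance is larger than the population variance for the same data because it divides by $n-1$ instead of $N$.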

⭐ Coefficient of Variation (CV) = $\left(\frac{SD}{\text{Mean}}\right) \times 100\%$. It is a relative measure of dispersion, useful for comparing variability between datasets with different units or widely differing means.

Measures of Dispersion: Coefficient of Variation - Relative Ruler

  • Definition: A standardized, unitless measure of relative variability, expressed as a percentage.
  • Formula: $CV = (SD / \text{Mean}) \times 100\%$
    • SD: Standard Deviation
    • Mean: Arithmetic Mean
  • Interpretation:
    • Higher CV $\rightarrow$ $\uparrow$ relative variability.
    • Lower CV $\rightarrow$ $\downarrow$ relative variability.
  • Use: Compares dispersion of datasets with different units or significantly different means.

⭐ CV is preferred over SD for comparing variability between groups with widely different means or different units of measurement, for example comparing variability in height (cm) with variability in weight (kg).
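
A minimal sketch of the height-versus-weight comparison follows. The measurements are hypothetical, and using the sample SD (`statistics.stdev`) is an assumption, since the notes do not say whether the sample or population SD is meant.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """CV as a percentage: (SD / Mean) * 100, using the sample SD (assumption)."""
    return stdev(values) / mean(values) * 100

# Hypothetical measurements, illustration only
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [50, 60, 70, 80, 90]

print(f"Height CV: {coefficient_of_variation(heights_cm):.2f}%")  # about 4.65%
print(f"Weight CV: {coefficient_of_variation(weights_kg):.2f}%")  # about 22.59%
```

The two SDs are in different units (cm and kg) and cannot be compared directly, but the unitless CVs show that weight is relatively more variable than height in this hypothetical group.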

High-Yield Points - ⚡ Biggest Takeaways

  • Range: Simplest measure; highly affected by outliers.
  • Interquartile Range (IQR): Spread of middle 50% of data; not affected by outliers.
  • Variance: Average of squared deviations from the mean; units are squared.
  • Standard Deviation (SD): Square root of variance; most common measure; 68-95-99.7 rule applies for normal distribution.
  • Coefficient of Variation (CV): Relative measure of dispersion (SD/Mean) × 100; compares datasets with different units.
  • Standard Error (SE): Measures precision of sample mean (SD/√n); decreases with ↑ sample size (n).
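
As a brief sketch of the standard error point, the snippet below applies $SE = SD/\sqrt{n}$ for increasing sample sizes. The SD of 10 and the sample sizes are hypothetical values, and `standard_error` is an illustrative helper name.

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / sqrt(n)

# Hypothetical SD of 10 units, illustration only
for n in (25, 100, 400):
    print(n, standard_error(10, n))  # 2.0, then 1.0, then 0.5
```

Quadrupling the sample size halves the standard error, because $n$ enters through its square root.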
