Arithmetic Mean - The Balancing Act
- Most common measure; the dataset's "center of gravity".
- Calculation: Sum of all values divided by the count of values.
- Formula: $\bar{x} = \frac{\sum x_i}{n}$
- Where $\sum x_i$ = sum of individual values, $n$ = number of values.
- Key Properties:
- Simple to compute & interpret.
- Includes all data points in calculation.
- Highly affected by extreme values (outliers).
- Sum of deviations of values from their mean is always 0: $\sum (x_i - \bar{x}) = 0$.
- Best for symmetrical data; less suitable for skewed distributions.
⭐ For a moderately skewed distribution, the empirical relationship is: $Mean - Mode \approx 3\,(Mean - Median)$.
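The properties above can be checked on a small sample. This is a minimal sketch: the helper name `arithmetic_mean` and the hospital-stay figures are illustrative assumptions, not data from the lesson.

```python
def arithmetic_mean(values):
    """Sum of all values divided by the count of values."""
    return sum(values) / len(values)

# Hypothetical sample: length of hospital stay in days
stays = [2, 3, 3, 4, 5]
x_bar = arithmetic_mean(stays)  # 17 / 5 = 3.4

# Deviations from the mean sum to zero (up to floating-point rounding)
deviation_sum = sum(x - x_bar for x in stays)

# A single extreme value pulls the mean strongly toward itself
stays_with_outlier = stays + [43]
x_bar_outlier = arithmetic_mean(stays_with_outlier)  # 60 / 6 = 10.0

print(x_bar, round(deviation_sum, 9), x_bar_outlier)
```

One long stay of 43 days moves the mean from 3.4 to 10.0, even though five of the six patients stayed 5 days or fewer.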
Median - The Resilient Middle
- Middlemost value in an ordered dataset (ascending/descending).
- Represents the 50th percentile; divides distribution into two equal halves.
- Calculation:
- Odd number of observations ($n$ odd): The $(\frac{n+1}{2})^{th}$ value of the ordered data.
- Even number of observations ($n$ even): Average of the $(\frac{n}{2})^{th}$ and $(\frac{n}{2} + 1)^{th}$ values of the ordered data.
- Key advantage: Unaffected by extreme values (outliers), making it robust.
- Preferred for skewed data (e.g., income, hospital stay duration, incubation period).
- 📌 "Median in the Middle" - stays central despite extreme pulls.

⭐ For skewed distributions (e.g., income data, incubation periods), Median is a more robust and often preferred measure of central tendency over Mean, as it is not influenced by outliers.
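The odd/even rule can be sketched as follows. The function name `median_value` and the incubation-period numbers are hypothetical, chosen only to illustrate the calculation.

```python
def median_value(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th value, which is index n // 2 when counting from 0
        return ordered[mid]
    # Even n: average of the (n / 2)-th and (n / 2 + 1)-th values
    return (ordered[mid - 1] + ordered[mid]) / 2

# Hypothetical incubation periods in days
odd_sample = [3, 1, 7, 5, 4]       # ordered: 1, 3, 4, 5, 7
even_sample = [1, 3, 4, 5]         # two middle values: 3 and 4
with_outlier = [1, 3, 4, 5, 70]    # largest value replaced by an extreme one

print(median_value(odd_sample), median_value(even_sample), median_value(with_outlier))
```

Replacing 7 with 70 leaves the median at 4, while the mean of the same five values rises from 4 to 16.6.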
Mode - The Popular Peak
- Definition: The value that appears most frequently in a data set.
- Key Characteristics:
- Represents the most common observation.
- Can be:
- Unimodal (one mode)
- Bimodal (two modes)
- Multimodal (>2 modes)
- No mode (all values occur equally)
- Unaffected by extreme values (outliers).
- The only measure of central tendency that can be used for nominal (categorical) data.
- Empirical Relationship (moderately skewed distributions):
- $Mode \approx 3 \cdot Median - 2 \cdot Mean$
- Advantages: Easy to understand; useful for qualitative data & identifying the most frequent size/category.
- Disadvantages: Not always unique/may not exist; not based on all values; poor for algebraic manipulation.
⭐ For a J-shaped distribution, the mode is at the highest point, which is at one end of the distribution.
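The unimodal, bimodal, and no-mode cases can be told apart by counting frequencies. This is a sketch under stated assumptions: the helper `modes` and its convention of returning an empty list when all values occur equally often are illustrative choices, and the sample data are made up.

```python
from collections import Counter

def modes(values):
    """Return all most-frequent values, or [] when every value occurs equally often."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    # More than one distinct value, all tied: treat as "no mode"
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 2, 2, 3]))           # one mode
print(modes([1, 1, 2, 2, 3]))        # two modes
print(modes([1, 2, 3]))              # no mode
print(modes(["A", "O", "O", "B"]))   # works for nominal data such as blood groups
```

Because the function only counts occurrences, it works on categories that cannot be averaged or ordered in any meaningful way.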

Relationships & Selection - The Skewed Showdown
- Distribution & Central Tendency:
- Symmetrical (unimodal): Mean = Median = Mode.
- Positively Skewed (Right): Mean > Median > Mode. 📌 Mean pulled by right tail.
- Negatively Skewed (Left): Mean < Median < Mode. 📌 Mean pulled by left tail.
- Empirical Relationship (Moderately Skewed, Unimodal):
- $Mode \approx 3 \times Median - 2 \times Mean$.
- Selection Guide:
- Mean: Best for normal quantitative data; sensitive to outliers.
- Median: Best for skewed quantitative data, data with outliers, or ordinal data.
- Mode: Best for nominal data; useful for bimodal/multimodal or qualitative data.

⭐ The Median is the most robust measure of central tendency for skewed distributions or data with extreme outliers as it is least affected by them.
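The ordering of the three measures and the empirical relationship can be tried on a small right-skewed sample. The helper names `skew_direction` and `empirical_mode` and the data are assumptions made for illustration. The empirical formula is only an approximation and can be far off for small samples.

```python
from statistics import mean, median, mode

def skew_direction(mean_value, median_value):
    """Rule-of-thumb skew direction from the position of the mean relative to the median."""
    if mean_value > median_value:
        return "positive"
    if mean_value < median_value:
        return "negative"
    return "symmetric"

def empirical_mode(mean_value, median_value):
    """Empirical estimate: Mode ≈ 3 × Median − 2 × Mean."""
    return 3 * median_value - 2 * mean_value

# Hypothetical right-skewed sample (long tail of large values)
data = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]
m, md, mo = mean(data), median(data), mode(data)

print(m, md, mo)                 # Mean > Median > Mode for this sample
print(skew_direction(m, md))
print(empirical_mode(m, md))     # rough estimate, compare with the observed mode
```

For this sample the observed mode is 2 while the empirical estimate is 0, a reminder that the relationship is a rough guide for moderately skewed data rather than an identity.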
High‑Yield Points - ⚡ Biggest Takeaways
- Mean (average) is most affected by outliers; use for normal distributions.
- Median (middle value, 50th percentile) is robust to outliers; best for skewed data.
- Mode is the most frequent value; useful for categorical/nominal data.
- Symmetrical distribution: Mean = Median = Mode.
- Positively skewed (right tail): Mean > Median > Mode.
- Negatively skewed (left tail): Mean < Median < Mode.
- Geometric Mean for growth rates/ratios; Harmonic Mean for averaging rates (e.g., speeds over equal distances).
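As a brief illustration of the last point, Python's standard `statistics` module provides `geometric_mean` (Python 3.8+) and `harmonic_mean`. The speeds and growth factors below are made-up values.

```python
from statistics import geometric_mean, harmonic_mean

# Hypothetical round trip: 40 km/h out and 60 km/h back over the same distance.
# The harmonic mean gives the true average speed; the arithmetic mean (50) overstates it.
avg_speed = harmonic_mean([40, 60])

# Hypothetical yearly growth factors: +10%, +20%, -10%.
# The geometric mean is the constant factor that gives the same overall growth.
avg_growth = geometric_mean([1.10, 1.20, 0.90])

print(round(avg_speed, 2), round(avg_growth, 4))
```

The average speed comes out at 48 km/h, and the average growth factor is about 1.059 per year.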