Sample size for non-inferiority trials

Non-Inferiority Trials - Not Worse, Just Different

  • Goal: To show a new treatment is not unacceptably worse than the standard. Used when new options offer other benefits (e.g., ↑safety, ↓cost).
  • Non-Inferiority Margin (δ): The pre-specified, largest clinically acceptable loss of effect at which the new treatment is still considered "good enough."
  • Hypotheses (Difference = standard minus new, for an outcome where higher is better, so positive values mean the new treatment is worse):
    • H₀ (Null): The new treatment is inferior (Difference ≥ δ).
    • H₁ (Alternative): The new treatment is non-inferior (Difference < δ).
  • Sample Size: Influenced by α, β (power = 1-β), variance, and δ. A smaller, stricter margin (↓δ) requires a ↑ sample size.

Non-inferiority trial confidence interval interpretation

⭐ For non-inferiority to be claimed, the entire confidence interval for the treatment difference (standard minus new) must lie below the non-inferiority margin (δ); in practice, the upper confidence limit must be less than δ.
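
The confidence-interval rule above can be sketched in Python. This is a minimal illustration, not a validated analysis routine: the function name `ni_decision` and its arguments are hypothetical, the difference is taken as standard minus new (so larger values mean the new treatment is worse), and a normal approximation is assumed for the interval. Labelling a result "inferior" when the whole interval lies beyond the margin is an assumption of this sketch.

```python
from statistics import NormalDist


def ni_decision(diff, se, margin, alpha=0.025):
    """Classify a trial result from the CI of diff = standard - new.

    diff   : observed difference (positive = new treatment is worse)
    se     : standard error of the difference
    margin : non-inferiority margin (a positive number)
    alpha  : one-sided significance level (0.025 gives a two-sided 95% CI)
    """
    z = NormalDist().inv_cdf(1 - alpha)  # about 1.96 for alpha = 0.025
    lower, upper = diff - z * se, diff + z * se
    if upper < margin:
        verdict = "non-inferior"   # whole CI is below the margin
    elif lower > margin:
        verdict = "inferior"       # assumption: whole CI is beyond the margin
    else:
        verdict = "inconclusive"   # CI crosses the margin
    return lower, upper, verdict


# Illustrative numbers only (not from any real trial)
print(ni_decision(diff=1.0, se=1.5, margin=5.0))
print(ni_decision(diff=3.0, se=1.5, margin=5.0))
```

The first call has an upper limit below the margin and is classified as non-inferior; the second has an interval that straddles the margin and is classified as inconclusive.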

NI Sample Size - The Secret Sauce

  • Goal: Prove a new treatment is not unacceptably worse than the standard. The sample size hinges on the non-inferiority margin (δ).

  • Core Formula (per group):

    • $n = \frac{(Z_{\alpha} + Z_{\beta})^2 \times (2\sigma^2)}{(\Delta - \delta)^2}$
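    • $\Delta$: Assumed true difference between treatments (standard minus new, so positive values mean the new treatment is worse; often assumed to be 0).
    • $\delta$: The pre-defined non-inferiority margin.
  • Key Relationship: The required sample size is highly sensitive to the gap between the true effect ($\Delta$) and the NI margin ($\delta$): as $\Delta$ approaches $\delta$, the denominator shrinks and $n$ grows rapidly.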
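
As a rough sketch, the core formula can be coded directly. The function name `ni_sample_size` and its parameter names are hypothetical; the sketch assumes a continuous outcome with a common standard deviation in both groups, a one-sided α, and a true difference that is smaller than the margin.

```python
from math import ceil
from statistics import NormalDist


def ni_sample_size(sigma, margin, true_diff=0.0, alpha=0.025, power=0.80):
    """Per-group n from n = 2*sigma^2*(Z_alpha + Z_beta)^2 / (true_diff - margin)^2."""
    if true_diff >= margin:
        raise ValueError("assumed true difference must be smaller than the margin")
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    z_beta = NormalDist().inv_cdf(power)       # value corresponding to the power
    n = 2 * sigma**2 * (z_alpha + z_beta) ** 2 / (true_diff - margin) ** 2
    return ceil(n)  # round up to whole subjects


# Illustrative inputs: sigma = 10, margin = 5
print(ni_sample_size(sigma=10, margin=5))               # true difference of 0
print(ni_sample_size(sigma=10, margin=5, true_diff=2))  # new treatment slightly worse
```

With these illustrative inputs the second call returns a noticeably larger n than the first, because the gap between the assumed true difference and the margin has narrowed from 5 to 3.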

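Exam Pearl: Counterintuitively, non-inferiority trials often require a larger sample size than superiority trials, because the margin (δ) is usually much smaller than the effect a superiority trial is powered to detect. The requirement grows further if the new drug is expected to be even slightly worse than the standard (Δ > 0), since the gap between Δ and δ narrows.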
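
One way to see this, as a sketch under simplifying assumptions: compare the formula above (with Δ = 0) against the usual two-group superiority formula, which has the same numerator but divides by the squared difference $d$ the trial is powered to detect. The symbol $d$ is introduced here for illustration only, and the same σ, power, and critical value are assumed for both designs (one-sided α = 0.025 and two-sided α = 0.05 both give 1.96).

$$ \frac{n_{\text{NI}}}{n_{\text{sup}}} = \frac{2\sigma^2 (Z_{\alpha} + Z_{\beta})^2 / \delta^2}{2\sigma^2 (Z_{\alpha} + Z_{\beta})^2 / d^2} = \left(\frac{d}{\delta}\right)^2 $$

If the margin is set at half of the difference a superiority trial would target ($\delta = d/2$), the non-inferiority trial needs about four times as many subjects per group.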

The Formula - Cranking the Numbers

  • Calculates subjects needed to prove a new treatment is not unacceptably worse than standard treatment.
  • Formula for continuous outcomes (per group): $$ n = \frac{2 \sigma^2 (Z_{\alpha} + Z_{\beta})^2}{(\Delta - \delta)^2} $$
    • Key Inputs:
      • $Z_{\alpha}$: Standard normal critical value for the one-sided significance level (e.g., 1.96 for one-sided α=0.025)
      • $Z_{\beta}$: Standard normal value corresponding to the desired power (e.g., 0.84 for 80% power, β=0.20)
      • $\sigma^2$: Data variability (variance)
      • $\delta$: The non-inferiority margin
      • $\Delta$: Expected difference in effect (often assumed to be 0)
  • Sample Size Drivers:
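    • Sample size ↑ as power ↑, the significance level becomes stricter (α ↓), or variance ↑.
    • Crucially, sample size ↑ dramatically as the margin (δ) ↓ (becomes stricter): with Δ = 0, n is proportional to 1/δ², so halving the margin quadruples the sample size.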
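
A worked example with made-up inputs may help. Assume σ = 10, δ = 5, Δ = 0, and the rounded values $Z_{\alpha}$ = 1.96 and $Z_{\beta}$ = 0.84; none of these numbers come from a real trial.

$$ n = \frac{2 \times 10^2 \times (1.96 + 0.84)^2}{(0 - 5)^2} = \frac{200 \times 7.84}{25} = 62.72 \;\Rightarrow\; 63 \text{ per group} $$

Halving the margin to δ = 2.5 while keeping everything else fixed gives:

$$ n = \frac{200 \times 7.84}{(0 - 2.5)^2} = \frac{1568}{6.25} = 250.88 \;\Rightarrow\; 251 \text{ per group} $$

The unrounded result is four times larger (250.88 versus 62.72), as expected when the squared margin sits in the denominator.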

⭐ The non-inferiority margin (δ) is the most critical choice. It must be smaller than the active control's established benefit over placebo, ensuring the new drug preserves a clinically meaningful effect.

Sample Size Levers - Dialing It In

  • Non-Inferiority Margin (δ): The most critical lever.
    • Smaller (stricter) margin → ↑ sample size.
    • Larger (lenient) margin → ↓ sample size.
  • Power (1-β):
    • Higher power (e.g., 90% vs 80%) → ↑ sample size. Reduces Type II error risk.
  • Significance Level (α):
    • Lower α (e.g., 0.01) → ↑ sample size. Reduces Type I error risk.
  • Outcome Variability (σ²):
    • Higher data variability → ↑ sample size for precise estimates.
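
The effect of each lever can be explored with a short Python sketch that changes one input at a time from a baseline. The helper `n_per_group`, the scenario names, and all input values are hypothetical and chosen only for illustration; the calculation uses the continuous-outcome formula with a one-sided α.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(sigma=10.0, margin=5.0, alpha=0.025, power=0.80, true_diff=0.0):
    """Per-group sample size for a continuous outcome (one-sided alpha)."""
    z = NormalDist().inv_cdf
    numerator = 2 * sigma**2 * (z(1 - alpha) + z(power)) ** 2
    return ceil(numerator / (true_diff - margin) ** 2)


# Each scenario overrides exactly one baseline input
scenarios = {
    "baseline": {},
    "stricter margin (4 instead of 5)": {"margin": 4.0},
    "higher power (90%)": {"power": 0.90},
    "lower alpha (0.01)": {"alpha": 0.01},
    "more variability (sigma = 15)": {"sigma": 15.0},
}

results = {name: n_per_group(**overrides) for name, overrides in scenarios.items()}
for name, n in results.items():
    print(f"{name:35s} n per group = {n}")
```

Every non-baseline scenario yields a larger n than the baseline, matching the direction of each lever listed above.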

⭐ The non-inferiority margin (δ) isn't arbitrary. It's set based on historical data of the active control's effect over a placebo, ensuring the new drug preserves a clinically meaningful effect.

High‑Yield Points - ⚡ Biggest Takeaways

  • The goal is to show a new treatment is not unacceptably worse than the standard one.
  • A pre-specified non-inferiority margin (δ) sets the boundary of acceptable difference.
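  • Success requires the entire confidence interval of the effect (new minus standard) to be above -δ; equivalently, with the difference expressed as standard minus new, the entire interval must lie below δ.
  • Sample size is driven by the margin (δ), power (1-β), significance (α), and outcome variance (σ²).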
  • A smaller (stricter) margin demands a larger sample size to achieve adequate power.
  • If the CI crosses -δ, the result is inconclusive, not a confirmation of inferiority.
