Sampling Fundamentals - The Right Slice
- Population (N): The entire group a study aims to understand.
- Sample (n): A subset of the population from which data is collected; ideally it is representative of that population.
- The goal is to use the sample to make inferences about the population.
- Sampling Frame: The specific list of individuals from which the sample is drawn (e.g., a clinic's patient list).
- Sampling Bias: A systematic error where the sample is not representative of the population, threatening the study's external validity.

⭐ Generalizability (External Validity): The degree to which findings can be applied to the broader population. This is highly dependent on how well the sample represents the population.
Probability Sampling - Truly Random Acts
Ensures every member of the population has a known, non-zero chance of being selected, minimizing selection bias. Essential for generalizability (external validity).
- Simple Random Sampling (SRS)
  - Every individual has an equal chance of selection.
  - Like a lottery; requires a full population list (sampling frame).
- Systematic Sampling
  - Select individuals at a regular interval (every k-th person) from a list after a random start.
  - Efficient, but can be biased if the list has a periodic pattern.
- Stratified Sampling
  - Divide the population into internally homogeneous subgroups (strata), e.g., by age or race.
  - Perform SRS within each stratum.
  - Guarantees representation of key subgroups.
- Cluster Sampling
  - Divide the population into internally heterogeneous groups (clusters), e.g., hospitals or zip codes.
  - Randomly select entire clusters to sample.
  - 📌 Mnemonic: "Clusters are mini-populations."
⭐ Stratified sampling increases precision and ensures minority subgroups are adequately represented, boosting statistical power for subgroup analyses.
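The four probability methods above can be sketched in a few lines of Python. This is a minimal illustration, not a survey-grade implementation: the patient frame, the field names (`clinic`, `age_group`), the sample sizes, and the helper function names are all hypothetical, and the systematic sampler assumes the frame size is at least the sample size.

```python
import random
from collections import defaultdict

def simple_random_sample(frame, n, rng):
    """SRS: every unit in the frame has the same chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Take every k-th unit after a random start inside the first interval."""
    k = len(frame) // n           # sampling interval
    start = rng.randrange(k)      # random start in [0, k)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(frame, stratum_of, n_per_stratum, rng):
    """Run SRS inside each stratum, so every stratum is represented."""
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for label in sorted(strata):
        sample.extend(rng.sample(strata[label], n_per_stratum))
    return sample

def cluster_sample(frame, cluster_of, n_clusters, rng):
    """Randomly pick whole clusters and keep everyone inside them."""
    clusters = defaultdict(list)
    for unit in frame:
        clusters[cluster_of(unit)].append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for label in chosen for unit in clusters[label]]

# Hypothetical sampling frame: 1,000 patients, 10 clinics, 2 age groups.
rng = random.Random(42)
frame = [
    {"id": i, "clinic": i % 10, "age_group": "65+" if i % 5 == 0 else "<65"}
    for i in range(1000)
]

srs = simple_random_sample(frame, 50, rng)
sys_s = systematic_sample(frame, 50, rng)
strat = stratified_sample(frame, lambda u: u["age_group"], 25, rng)
clus = cluster_sample(frame, lambda u: u["clinic"], 2, rng)

print("clinics in systematic sample:", {u["clinic"] for u in sys_s})
print("clinics in cluster sample:", {u["clinic"] for u in clus})
```

In this made-up frame, clinic and age group repeat every 10 and every 5 rows, and the sampling interval is 20, so the systematic sample lands in a single clinic and a single age group. That is the periodic-pattern bias noted above; shuffling the list or choosing an interval that does not align with the pattern would avoid it.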
Non-Probability Sampling - Conveniently Biased
Selection isn't random; it relies on the researcher's judgment or convenience. This introduces selection bias, limiting the generalizability of findings to the broader population.
- Types of Non-Probability Sampling:
- Convenience Sampling: Choosing easily accessible subjects (e.g., patients in a single clinic). Very prone to selection bias.
- Quota Sampling: Filling pre-set quotas for subgroups (e.g., 50 men, 50 women) in a non-random way.
- Purposive (Judgmental) Sampling: Researcher handpicks subjects based on specific criteria or expertise.
- Snowball Sampling: Participants recruit other eligible participants. Useful for hard-to-reach or hidden populations.

⭐ Key limitation: Because selection is non-random, the sample may not be representative, so findings from non-probability sampling cannot be reliably generalized to the entire population. The study has low external validity.
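A small simulation can make this limitation concrete. The sketch below assumes an invented population in which the 20% of adults who attend a clinic have higher systolic blood pressure than everyone else; the numbers (means of 145 and 125 mmHg, samples of 200) are illustrative assumptions, not real data.

```python
import random
from statistics import mean

rng = random.Random(0)

# Hypothetical population of 10,000 adults; the first 2,000 attend a clinic
# and (by assumption) have higher systolic blood pressure (SBP).
population = []
for i in range(10_000):
    attends_clinic = i < 2_000
    sbp = rng.gauss(145, 10) if attends_clinic else rng.gauss(125, 10)
    population.append({"attends_clinic": attends_clinic, "sbp": sbp})

pop_mean = mean(p["sbp"] for p in population)

# Convenience sample: the first 200 patients seen at the clinic.
clinic_patients = [p for p in population if p["attends_clinic"]]
convenience = clinic_patients[:200]

# Simple random sample of the same size from the whole population.
srs = rng.sample(population, 200)

conv_mean = mean(p["sbp"] for p in convenience)
srs_mean = mean(p["sbp"] for p in srs)

print(f"population mean SBP:  {pop_mean:.1f}")
print(f"convenience estimate: {conv_mean:.1f}")
print(f"SRS estimate:         {srs_mean:.1f}")
```

Under these assumptions the convenience estimate sits far above the population mean, while the random sample lands close to it. Collecting more clinic patients would not help, because the error comes from who is sampled, not from how many.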
Sampling Biases - Dodging Disasters
- Selection Bias: Sample is not representative of the target population, limiting external validity.
- Ascertainment Bias: Nonrandom sampling creates a skewed sample (e.g., using only hospitalized patients).
- Nonresponse Bias: People who participate differ systematically from those who decline or cannot be reached.
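- Berkson Bias: Hospital-based samples are sicker and carry different exposures than the general population, which can distort disease prevalence and exposure-disease associations.
- Healthy Worker Effect: Working populations are healthier on average than the general population, so comparisons against the general population can underestimate occupational risk.
⭐ Neyman (Prevalence-Incidence) Bias: Sampling prevalent cases (e.g., in a case-control study) misses rapidly fatal or quickly resolving cases, so the remaining cases are not representative of all cases that occurred.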
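As a worked example of one of these biases, the sketch below computes how nonresponse can distort a prevalence estimate when people with the condition are more likely to answer a survey. The function name and the figures (30% true prevalence, 80% versus 40% response rates) are hypothetical, chosen only to show the arithmetic.

```python
def prevalence_among_respondents(true_prev, p_respond_cases, p_respond_noncases):
    """Expected prevalence observed among survey respondents.

    true_prev: true prevalence in the target population (0 to 1)
    p_respond_cases: probability that a person with the condition responds
    p_respond_noncases: probability that a person without it responds
    """
    responding_cases = true_prev * p_respond_cases
    responding_noncases = (1 - true_prev) * p_respond_noncases
    return responding_cases / (responding_cases + responding_noncases)

# Hypothetical survey: cases respond twice as often as non-cases.
biased = prevalence_among_respondents(0.30, 0.80, 0.40)

# Same survey if response did not depend on disease status.
unbiased = prevalence_among_respondents(0.30, 0.50, 0.50)

print(f"observed prevalence, differential response: {biased:.3f}")
print(f"observed prevalence, equal response:        {unbiased:.3f}")
```

With differential response the respondents overstate prevalence (0.24 / 0.52, about 46% instead of 30%); when response rates are equal across groups, the observed prevalence matches the true value regardless of how low the overall response rate is.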
High‑Yield Points - ⚡ Biggest Takeaways
- Random sampling is crucial for generalizability (external validity), allowing inferences about a larger population.
- Stratified sampling ensures specific subgroups are adequately represented, improving precision for those groups.
- Cluster sampling randomly selects natural groups (e.g., hospital wards), which is cheaper and more convenient but generally less precise than simple random sampling of the same size.
- Convenience sampling is highly susceptible to selection bias, severely limiting external validity.
- Random sampling minimizes selection bias; randomization in trials minimizes confounding.