Sampling Fundamentals - The Who & Why
- Population (N): Entire group of interest.
- Sample (n): Subset of N, chosen for study.
- Sampling Frame: List of all units in N for sample selection.
- Parameter: Population characteristic (e.g., $\mu$).
- Statistic: Sample characteristic (e.g., $\bar{x}$), estimates parameter.
- Purpose: Feasibility (cost, time); make inferences about N.
- Errors:
- Sampling Error: Sample-population discrepancy; ↓ with ↑ n.
- Non-Sampling Error: Systematic; due to flaws in selection, measurement, or processing; not reduced by ↑ n.
⭐ The primary goal of sampling is to draw inferences about a larger population based on a smaller, representative subset, balancing precision with practicality.
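The parameter/statistic distinction above can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the population values, the seed, and the sizes (N = 1000, n = 100) are all made-up assumptions.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of N = 1000 measurements (values are made up).
population = [rng.gauss(120, 15) for _ in range(1000)]
mu = statistics.fmean(population)      # parameter: population mean

# Sample of n = 100 units drawn without replacement.
sample = rng.sample(population, 100)
x_bar = statistics.fmean(sample)       # statistic: sample mean, estimates mu

sampling_error = x_bar - mu            # chance discrepancy for this one sample
print(f"mu = {mu:.2f}, x_bar = {x_bar:.2f}, sampling error = {sampling_error:+.2f}")
```

A different seed gives a different sample and a different sampling error, while $\mu$ stays fixed for a given population.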
Probability Sampling - Everyman's Chance
- Core principle: Every member has a known, non-zero selection chance. Allows generalization to the population.
- Types:
- Simple Random Sampling (SRS):
- Each unit has an equal and independent chance of selection.
- Methods: Lottery, random number tables/generator.

- Systematic Sampling:
- Select units at regular intervals (e.g., every $k^{th}$ unit).
- Sampling interval $k = N/n$ (N=population size, n=sample size).
- Requires a random start (between 1 and $k$); biased if the list has a periodic pattern that coincides with $k$.
- Stratified Sampling:
- Population divided into homogeneous subgroups (strata) based on specific characteristics (e.g., age, sex).
- SRS or systematic sampling is then done within each stratum.
- Ensures representation of key subgroups; increases precision.
⭐ Stratified sampling is preferred when the population is heterogeneous and key subgroups must be proportionally represented; sampling within each stratum increases precision and reduces sampling error, including for subgroup estimates.
- Cluster Sampling:
- Population divided into clusters (often geographic, e.g., villages, schools).
- Randomly select clusters; sample all units or a sample of units within selected clusters.
- Cost-effective for large, dispersed populations; may ↑ sampling error (design effect).
- Multistage Sampling:
- Complex form involving sampling in multiple stages (e.g., states → districts → villages → households).
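The probability designs listed above can be sketched with Python's standard library. This is an illustrative sketch under stated assumptions: the frame, the `sex` and `village` labels, the sizes, and the function names are all hypothetical, and the stratified version uses proportional allocation with simple rounding (which may not sum exactly to n for other stratum sizes).

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: N = 1000 units, each tagged with a
# stratum (sex) and a cluster (village). All labels are made up.
frame = [
    {"id": i, "sex": "F" if i % 5 < 3 else "M", "village": i // 50}
    for i in range(1000)
]
N, n = len(frame), 100


def simple_random(frame, n):
    """SRS: every unit has the same chance; drawn without replacement."""
    return rng.sample(frame, n)


def systematic(frame, n):
    """Every k-th unit after a random start, with k = N // n."""
    k = len(frame) // n
    start = rng.randrange(k)  # random start within the first interval
    return frame[start::k][:n]


def stratified(frame, n, key):
    """Proportional allocation: SRS within each stratum."""
    strata = {}
    for unit in frame:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for units in strata.values():
        share = round(n * len(units) / len(frame))  # stratum's share of n
        sample.extend(rng.sample(units, share))
    return sample


def cluster(frame, n_clusters, key):
    """One-stage cluster sampling: pick clusters at random, keep all units."""
    labels = sorted({unit[key] for unit in frame})
    chosen = set(rng.sample(labels, n_clusters))
    return [unit for unit in frame if unit[key] in chosen]


srs_sample = simple_random(frame, n)
sys_sample = systematic(frame, n)
strat_sample = stratified(frame, n, "sex")
clus_sample = cluster(frame, 2, "village")
```

In this made-up frame the `sex` label repeats every 5 units and k = 10, so the systematic sample contains only one sex: a small demonstration of the periodicity risk. The stratified sample, by contrast, reproduces the 60/40 split of the frame exactly.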
Non-Probability Sampling - Quick Picks & Quirks
- Subject selection is non-random, based on convenience or researcher judgment.
- Major Drawback: Findings not generalizable; high selection bias risk.
- Common in exploratory research or when random sampling is impractical.
- Methods:
- Convenience: Easiest to reach subjects. Fast, cheap; high bias.
- Purposive (Judgmental): Researcher selects based on specific traits/expertise.
- Quota: Non-random selection to fill subgroup quotas (e.g., age, gender).
- Snowball: Initial subjects refer subsequent ones.
⭐ Snowball sampling is particularly useful for accessing hidden, hard-to-reach, or socially networked populations (e.g., drug users, rare disease patients).
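Quota sampling, as described in the methods above, can be sketched as "accept whoever turns up until each subgroup quota is full". The arrival list, the `gender` key, and the quotas below are hypothetical; the point is that selection depends on arrival order rather than on a random mechanism.

```python
from collections import Counter


def quota_sample(arrivals, quotas, key):
    """Accept subjects in arrival order until every subgroup quota is filled."""
    filled = Counter()
    sample = []
    for person in arrivals:
        group = person[key]
        if filled[group] < quotas.get(group, 0):
            sample.append(person)
            filled[group] += 1
        if all(filled[g] >= q for g, q in quotas.items()):
            break  # all quotas met; later arrivals are never considered
    return sample


# Hypothetical stream of subjects in the order they were encountered.
arrivals = [{"id": i, "gender": g} for i, g in enumerate("MMMFMFMMFF")]
picked = quota_sample(arrivals, {"M": 2, "F": 2}, "gender")
print([p["id"] for p in picked])  # [0, 1, 3, 5]
```

The subgroup counts match the quotas, yet the subjects within each subgroup are simply the first ones reached, which is where the selection bias enters.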
Sampling Errors & Bias - Data Tripwires
- Sampling Error (Random Error):
- Difference between sample statistic & true population parameter due to chance.
- Unavoidable; inherent to sampling.
- Magnitude ↓ with ↑ sample size ($n$).
- Quantified by the Standard Error (SE) of the mean: $SE = \frac{\sigma}{\sqrt{n}}$ ($\sigma$ = population standard deviation).
- Non-Sampling Error (Bias/Systematic Error):
- Systematic deviation from the true value; not due to chance.
- Leads to inaccurate (invalid) results; not reduced by ↑ $n$.
- Major Types:
- Selection Bias: Sample not representative of the target population.
- Examples: Sampling bias (faulty technique), volunteer bias, non-response bias, Berkson's bias (hospital-based studies), Neyman bias (incidence-prevalence bias; e.g., missing fatal/mild cases). 📌 Neyman: No early/mild/dead.
- Information Bias (Measurement/Observation Bias): Errors in data collection or measurement.
- Examples: Recall bias, interviewer bias, observer bias, misclassification bias.

⭐ Selection bias, where the sample is not representative of the population due to systematic differences in choosing participants, is a critical flaw that can invalidate study conclusions.
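The formula $SE = \frac{\sigma}{\sqrt{n}}$ given under Sampling Error can be checked with a short calculation. The value $\sigma = 10$ and the sample sizes below are assumed purely for illustration, and `standard_error` is a hypothetical helper name.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


sigma = 10.0  # assumed population standard deviation
for n in (25, 100, 400):
    print(f"n = {n:>3}: SE = {standard_error(sigma, n):.1f}")
# n =  25: SE = 2.0
# n = 100: SE = 1.0
# n = 400: SE = 0.5
```

Quadrupling $n$ only halves the SE, because the error shrinks with $\sqrt{n}$ rather than $n$. Systematic error does not appear in this formula at all, which is why a larger sample does not correct bias.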
High‑Yield Points - ⚡ Biggest Takeaways
- Simple Random Sampling (SRS): Equal chance of selection for all units; best for homogeneous populations.
- Stratified Sampling: Divides population into homogeneous strata; SRS within each. ↑Precision, ↓error.
- Systematic Sampling: Selects units at regular intervals (k-th unit). Easy, but risk of periodicity bias.
- Cluster Sampling: Randomly selects intact groups (clusters). Cost-effective for dispersed populations; ↑sampling error vs SRS.
- Sampling Error: Inversely proportional to the square root of the sample size; ↑sample size, ↓error.
- Non-probability methods (e.g., convenience, quota) are prone to selection bias; results are not generalizable.