Sampling Methods

Sampling Fundamentals - The Who & Why

Population (N): Entire group of interest.
Sample (n): Subset of N, chosen for study.
Sampling Frame: List of all population units from which the sample is actually drawn.
Parameter: Population characteristic (e.g., $\mu$).
Statistic: Sample characteristic (e.g., $\bar{x}$), estimates parameter.
Purpose: Feasibility (cost, time); make inferences about N.
Errors:
- Sampling Error: Chance discrepancy between a sample statistic and the population parameter; ↓ with ↑ n.
- Non-Sampling Error: Due to flaws in selection, measurement, or processing; not reduced by ↑ n.

⭐ The primary goal of sampling is to draw inferences about a larger population based on a smaller, representative subset, balancing precision with practicality.

Probability Sampling - Everyman's Chance

Core principle: Every member has a known, non-zero selection chance. Allows generalization to the population.
Types:
- Simple Random Sampling (SRS):
  - Each unit has an equal and independent chance of selection.
  - Methods: Lottery, random number tables/generator.
- Systematic Sampling:
  - Select units at regular intervals (e.g., every $k^{th}$ unit).
  - Sampling interval $k = N/n$ (N=population size, n=sample size).
  - Requires a random start; can be biased if there's periodicity in the list.
- Stratified Sampling:
  - Population divided into homogeneous subgroups (strata) based on specific characteristics (e.g., age, sex).
  - SRS or systematic sampling is then done within each stratum.
  - Ensures representation of key subgroups; increases precision.
  ⭐ Stratified sampling is preferred when the population is heterogeneous and key subgroups must be represented (e.g., in proportion to their size). Because units within a stratum are similar, it gives more precise estimates than SRS of the same size.
- Cluster Sampling:
  - Population divided into clusters (often geographic, e.g., villages, schools).
  - Randomly select clusters; sample all units or a sample of units within selected clusters.
  - Cost-effective for large, dispersed populations; usually ↑ sampling error compared with SRS of the same size, because units within a cluster tend to be similar (design effect).
- Multistage Sampling:
  - Sampling carried out in successive stages, with a random selection at each stage (e.g., states → districts → villages → households).
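
The four basic designs above can be sketched in a few lines of Python. This is a minimal illustration, not a survey-grade implementation: the function names, the toy population of 100 numbered units, the odd/even strata, and the four equal-sized clusters are all assumptions made for the example. Stratified sampling here uses proportional allocation, which is only one of several allocation rules.

```python
import random
from collections import defaultdict

def simple_random_sample(units, n, rng):
    # SRS without replacement: every unit has the same chance of selection.
    return rng.sample(units, n)

def systematic_sample(units, n, rng):
    # Sampling interval k = N/n (rounded down), with a random start in [0, k).
    k = len(units) // n
    start = rng.randrange(k)
    return [units[start + i * k] for i in range(n)]

def stratified_sample(units, stratum_of, fraction, rng):
    # Group units into strata, then draw an SRS of the same fraction
    # from each stratum (proportional allocation, an assumed choice).
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    # Randomly select whole clusters and take every unit inside them.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

# Toy population (hypothetical): units numbered 1..100.
rng = random.Random(42)
units = list(range(1, 101))
clusters = {
    "A": units[0:25], "B": units[25:50],
    "C": units[50:75], "D": units[75:100],
}

print(simple_random_sample(units, 10, rng))
print(systematic_sample(units, 10, rng))  # every 10th unit after a random start
print(stratified_sample(units, lambda u: "even" if u % 2 == 0 else "odd", 0.1, rng))
print(len(cluster_sample(clusters, 2, rng)))  # 2 clusters of 25 units each
```

Multistage sampling can be built from the same pieces, for example by calling `cluster_sample` to pick clusters and then `simple_random_sample` within each selected cluster.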

Non-Probability Sampling - Quick Picks & Quirks

Subject selection is non-random, based on convenience or researcher judgment.
Major Drawback: Selection probabilities are unknown, so findings cannot be statistically generalized to the population; high risk of selection bias.
Common in exploratory research or when random sampling is impractical.
Methods:
- Convenience: Easiest to reach subjects. Fast, cheap; high bias.
- Purposive (Judgmental): Researcher selects based on specific traits/expertise.
- Quota: Non-random selection to fill subgroup quotas (e.g., age, gender).
- Snowball: Initial subjects refer subsequent ones.
  
  ⭐ Snowball sampling is particularly useful for accessing hidden, hard-to-reach, or socially networked populations (e.g., drug users, rare disease patients).

Sampling Errors & Bias - Data Tripwires

Sampling Error (Random Error):
- Difference between sample statistic & true population parameter due to chance.
- Unavoidable; inherent to sampling.
- Magnitude ↓ with ↑ sample size ($n$).
- Quantified by the standard error (SE) of the mean: $SE = \frac{\sigma}{\sqrt{n}}$, where $\sigma$ is the population standard deviation.
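
As a quick numeric check of the formula above, the sketch below computes the SE for a few sample sizes. The value $\sigma = 10$ and the sample sizes are assumed purely for illustration; `standard_error` is a hypothetical helper name.

```python
import math

def standard_error(sigma, n):
    # SE of the mean: sigma / sqrt(n).
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / math.sqrt(n)

# Assumed population standard deviation of 10.
for n in (25, 100, 400):
    print(f"n={n}: SE={standard_error(10.0, n)}")
# n=25: SE=2.0
# n=100: SE=1.0
# n=400: SE=0.5
```

Each fourfold increase in $n$ only halves the SE, which is why precision gains become expensive at large sample sizes.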
Non-Sampling Error (Bias/Systematic Error):
- Systematic deviation from the true value; not due to chance.
- Leads to inaccurate (invalid) results; not reduced by ↑ $n$.
- Major Types:
  - Selection Bias: Sample not representative of the target population.
    - Examples: Sampling bias (faulty technique), volunteer bias, non-response bias, Berkson's bias (hospital-based studies), Neyman bias (incidence-prevalence bias; e.g., missing fatal/mild cases). 📌 Neyman: No early/mild/dead.
  - Information Bias (Measurement/Observation Bias): Errors in data collection or measurement.
    - Examples: Recall bias, interviewer bias, observer bias, misclassification bias.

⭐ Selection bias, where the sample is not representative of the population due to systematic differences in choosing participants, is a critical flaw that can invalidate study conclusions.
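
The contrast between random and systematic error can be seen in a small simulation. This is a sketch under assumed conditions: a synthetic population drawn from a normal distribution (mean 50, SD 10) and a hypothetical faulty sampling frame that can only reach units with values above 45. None of these numbers come from a real study.

```python
import random
import statistics

def compare_bias(seed=0):
    rng = random.Random(seed)
    # Synthetic population (assumed): 100,000 values, mean 50, SD 10.
    population = [rng.gauss(50, 10) for _ in range(100_000)]
    true_mean = statistics.fmean(population)

    # Hypothetical faulty frame: only units with value > 45 are reachable.
    reachable = [x for x in population if x > 45]

    results = {}
    for n in (100, 10_000):
        srs = rng.sample(population, n)    # probability sample
        biased = rng.sample(reachable, n)  # sample from the faulty frame
        results[n] = (
            statistics.fmean(srs) - true_mean,     # error from chance only
            statistics.fmean(biased) - true_mean,  # chance plus selection bias
        )
    return results

for n, (srs_error, biased_error) in compare_bias().items():
    print(f"n={n}: SRS error={srs_error:+.2f}, biased-frame error={biased_error:+.2f}")
```

With the larger sample, the SRS error shrinks toward zero, while the error from the faulty frame stays at roughly the same size: increasing $n$ reduces sampling error but not selection bias.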

High‑Yield Points - ⚡ Biggest Takeaways

Simple Random Sampling (SRS): Equal chance of selection for all units; best for homogeneous populations.

Stratified Sampling: Divides population into homogeneous strata; SRS within each. ↑Precision, ↓sampling error.

Systematic Sampling: Selects units at regular intervals (k-th unit). Easy, but risk of periodicity bias.

Cluster Sampling: Randomly selects intact groups (clusters). Cost-effective for dispersed populations; ↑sampling error vs SRS.

Sampling Error: Inversely proportional to the square root of the sample size; ↑sample size, ↓error.

Non-probability methods (e.g., convenience, quota) are prone to selection bias; results cannot be statistically generalized.
