Data Types & Sources - The Raw Material
- Data: Raw facts & figures.
- Types of Data:
- Qualitative (Categorical): Describes qualities.
- Nominal: Unordered categories (blood groups, gender).
- Ordinal: Ordered categories (Likert scale, disease severity: mild/moderate/severe).
- Quantitative (Numerical): Measurable quantities.
- Discrete: Countable, whole numbers (number of children, admissions).
- Continuous: Any value in a range (height, weight, BP).
- Sources of Data:
- Primary Data: First-hand collection by investigator (surveys, interviews, experiments).
- Secondary Data: Already collected (hospital records, census, NFHS).
⭐ Hospital records are a vital secondary data source for retrospective studies and health trends.
Sampling Methods - Picking Your Players
Goal: Select a representative subset from a population.
- Probability Sampling (Unbiased):
- Simple Random (SRS): Equal chance for all (e.g., lottery).
- Systematic: Every k-th unit selected after a random start (sampling interval $k = N/n$).
- Stratified: Population divided into homogeneous strata; SRS from each. ↑Precision.
- Cluster: Random clusters selected; all units in selected clusters sampled. Often ↑Efficiency, ↓Cost.
- Multistage: Sampling in multiple phases (e.g., national survey).
- Non-Probability Sampling (Bias prone):
- Convenience: Readily available participants.
- Purposive (Judgmental): Investigator's choice based on specific criteria/judgment.
- Quota: Predefined quotas for subgroups to mirror population proportions.
- Snowball: Participants recruit future subjects. Useful for hidden/rare populations.
⭐ Cluster sampling: The unit of sampling is a group (e.g., village, school) rather than an individual. It is particularly useful for large, geographically dispersed populations where SRS is impractical.
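
The three most commonly tested probability methods above can be sketched in a few lines of Python. This is a minimal illustration, not a survey-grade implementation: the sampling frame of 100 patient IDs, the urban/rural strata, the 10% sampling fraction, and all function names are assumptions made up for the example.

```python
import random

def simple_random_sample(units, n, seed=None):
    """SRS without replacement: every unit has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(units, n)

def systematic_sample(units, n, seed=None):
    """Random start within the first k units, then every k-th unit (k = N // n)."""
    rng = random.Random(seed)
    k = len(units) // n            # sampling interval, integer part of N/n
    start = rng.randrange(k)       # random start position in 0..k-1
    return [units[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, seed=None):
    """Draw an SRS with the same sampling fraction from each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        size = max(1, round(len(members) * fraction))
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical sampling frame: N = 100 patient IDs
population = list(range(1, 101))
srs = simple_random_sample(population, 10, seed=1)
sys_sample = systematic_sample(population, 10, seed=1)

# Hypothetical strata: 60 urban and 40 rural patients, 10% drawn from each
strata = {"urban": list(range(1, 61)), "rural": list(range(61, 101))}
strat = stratified_sample(strata, 0.1, seed=1)
```

With N = 100 and n = 10 the interval is k = 10, so the systematic sample is evenly spaced through the frame, while the stratified sample keeps the 60:40 urban-to-rural ratio by construction.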

Tabular Presentation - Order in the Court
- Systematic data arrangement in rows & columns; aids comparison, analysis.
- Key Parts (📌 Mnemonic: Tall Captains Shout Boldly From Ships):
- Title: Clear, concise (What, Where, When, How classified).
- Captions (Column heads) & Stubs (Row heads).
- Body: Numerical data.
- Footnote (Explanations) & Source Note (Origin).
- Types:
- Simple (One-way): 1 characteristic.
- Complex: Two-way (cross-tab), Three-way, Manifold.
- Principles: Logical, clear, units stated, totals, avoid clutter.
⭐ Cross-tabulation (two-way table) is pivotal for exploring associations between two categorical variables.
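
A two-way table can be built by counting each combination of the two categorical variables. The sketch below uses only the Python standard library; the eight smoking/disease records and the function name `cross_tabulate` are invented for illustration and are not real data.

```python
from collections import Counter

# Hypothetical records: (smoking status, disease status) for eight people
records = [
    ("smoker", "disease"), ("smoker", "disease"), ("smoker", "no disease"),
    ("non-smoker", "disease"), ("non-smoker", "no disease"),
    ("non-smoker", "no disease"), ("non-smoker", "no disease"),
    ("smoker", "disease"),
]

def cross_tabulate(pairs):
    """Return a nested dict of counts plus the sorted row and column labels."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})   # stubs (row heads)
    cols = sorted({c for _, c in pairs})   # captions (column heads)
    table = {r: {c: counts[(r, c)] for c in cols} for r in rows}
    return table, rows, cols

table, stubs, captions = cross_tabulate(records)

# Print the body with captions across the top and a row total for each stub
print(f"{'':12}" + "".join(f"{c:>12}" for c in captions) + f"{'Total':>8}")
for r in stubs:
    row_total = sum(table[r].values())
    print(f"{r:12}" + "".join(f"{table[r][c]:>12}" for c in captions) + f"{row_total:>8}")
```

The row and column labels play the roles of stubs and captions, and the counts form the body of the table.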
Graphical Presentation - A Picture's Worth
- Visuals for rapid data interpretation: patterns, trends, relationships.
- Histogram:
- Continuous quantitative data (e.g., age). Bars adjacent (touch).
- Area of bar proportional to frequency. X-axis: class intervals; Y-axis: frequency.
- Bar Chart:
- Qualitative or discrete data (e.g., gender). Bars separated.
- Length of bar proportional to frequency.
- Types: Simple, Multiple, Component/Proportional.
- Pie Chart (Sector Diagram):
- Proportions of whole (categorical data, e.g., causes of death).
- Total angle 360°. Sector angle = (Component value / Total value) × 360°.
- Line Chart/Graph:
- Trends over time (time-series data, e.g., disease cases/year).
- Frequency Polygon:
- Joins the midpoints of the tops of the histogram bars. Smooths data; compares distributions.
- Scatter Diagram (Correlation Plot):
- Direction & strength of the relationship between two quantitative variables. Pattern shows correlation.
- Box & Whisker Plot:
- Min, Q1, Median, Q3, Max. Shows data spread, central tendency, outliers.
- Compares group distributions.
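
Two items in the list above reduce to short calculations: the pie-chart sector angle and the five-number summary behind a box plot. The sketch below is illustrative only. The cause-of-death counts and the nine data values are hypothetical, and the quartiles use the "inclusive" method of Python's `statistics.quantiles`, which is one of several quartile conventions and may differ slightly from the one in a given textbook.

```python
import statistics

def sector_angles(components):
    """Sector angle = (component value / total value) x 360 degrees."""
    total = sum(components.values())
    return {name: value * 360 / total for name, value in components.items()}

def five_number_summary(values):
    """Min, Q1, median, Q3, max: the five values drawn in a box plot."""
    data = sorted(values)
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, q2, q3, data[-1]

# Hypothetical counts of deaths by cause (total = 100)
deaths = {"cardiovascular": 50, "cancer": 25, "injury": 15, "other": 10}
angles = sector_angles(deaths)

# Hypothetical lengths of hospital stay in days
stay_days = [2, 4, 4, 5, 7, 9, 10, 12, 15]
summary = five_number_summary(stay_days)
```

The sector angles always sum to 360°, which is a quick check on the arithmetic when drawing a pie chart by hand.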

⭐ Ogives (cumulative frequency curves) graphically determine median from grouped data.
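
Reading the median off an ogive amounts to finding where the cumulative frequency reaches N/2. The sketch below does the same thing numerically with the standard interpolation formula for grouped data, median = L + ((N/2 - cf) / f) × h, where L is the lower boundary of the median class, cf the cumulative frequency before it, f its frequency, and h its width. The class intervals and frequencies are hypothetical, and the formula assumes observations are spread evenly within the median class.

```python
from itertools import accumulate

def grouped_median(class_limits, frequencies):
    """Interpolate the median within the class where cumulative frequency reaches N/2."""
    half = sum(frequencies) / 2
    cumulative = 0                      # cf: cumulative frequency before the current class
    for (lower, upper), f in zip(class_limits, frequencies):
        if f > 0 and cumulative + f >= half:
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("no observations")

# Hypothetical grouped data: age classes (years) and their frequencies, N = 40
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [5, 8, 12, 10, 5]

# Points of the "less than" ogive: (upper class boundary, cumulative frequency)
ogive_points = list(zip((upper for _, upper in classes), accumulate(freqs)))

median = grouped_median(classes, freqs)
```

Here N/2 = 20 falls in the 20 to 30 class (cumulative frequency 13 before it, 25 after it), so the median is 20 + (7/12) × 10, roughly 25.8.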
High‑Yield Points - ⚡ Biggest Takeaways
- Primary data: First-hand collection. Secondary data: Pre-existing information.
- Sampling: Simple random (equal chance), stratified (subgroup representation).
- Qualitative data: Nominal (e.g., blood type), Ordinal (e.g., pain scale).
- Quantitative data: Discrete (e.g., hospital admissions), Continuous (e.g., height).
- Presentation: Histograms for continuous data; Bar charts for categorical/discrete data.
- Frequency polygon joins the midpoints of histogram bar tops; Ogive (cumulative frequency) for median and percentiles.
- Pie charts for proportions; Scatter plots for relationships between two variables.