Biostatistics Practice Questions

Q: What is the scale of measurement used for classifying a person as 'hypertensive', 'normotensive', or 'hypotensive'?

Ordinal scale.

### Explanation

**1. Why Ordinal Scale is Correct:** The classification of blood pressure into categories like **hypotensive, normotensive, and hypertensive** involves data that is qualitative but possesses a **natural rank or order**. In this case, there is a clear progression of severity or magnitude (Hypo < Normo < Hyper). While the exact numerical difference between these categories is not uniform, the relative position is fixed. Therefore, it falls under the **Ordinal Scale** (Order = Ordinal).

**2. Why Other Options are Incorrect:**

* **Nominal Scale:** This is used for simple labeling without any inherent order (e.g., Gender, Blood Group, or Eye Color). Since "Hypertensive" is objectively "higher" than "Normotensive," it is more than just a label.
* **Interval Scale:** This scale has a definite order and equal intervals between units, but **no absolute zero** (e.g., Temperature in Celsius). Clinical categories do not have equal mathematical intervals.
* **Ratio Scale:** This is the highest level of measurement and possesses an **absolute zero** (e.g., Height, Weight, or the actual BP reading in mmHg). If the question asked about the *actual systolic value* (e.g., 120 mmHg), the answer would be Ratio.

**3. Clinical Pearls & High-Yield Facts for NEET-PG:**

* **Mnemonic (NOIR):** **N**ominal (Name), **O**rdinal (Order), **I**nterval (In-between), **R**atio (Real zero).
* **Key Distinction:** If a variable is descriptive (Mild, Moderate, Severe), it is **Ordinal**. If it is a raw numerical value (Pulse rate, Glucose level), it is **Ratio**.
* **Likert Scales:** (e.g., "Strongly Agree" to "Strongly Disagree") are always **Ordinal**.
* **Statistical Test Tip:** For Nominal/Ordinal data, use **Non-parametric tests** (e.g., Chi-square). For Interval/Ratio data, use **Parametric tests** (e.g., T-test, ANOVA).

Q: What is the best method for comparing mortality rates between two populations with different age structures?

Age-adjusted rates.

### Explanation

**Why Age-adjusted rates are correct:** Mortality is heavily influenced by age; older populations naturally have higher death rates. When comparing two populations with different age distributions (e.g., a "young" developing country vs. an "old" developed country), a direct comparison of deaths is misleading. **Age-adjusted (standardized) rates** remove the confounding effect of age by applying the observed age-specific death rates to a single "standard population." This ensures that any observed difference in mortality is due to actual health factors rather than simply having more elderly citizens.

**Why the other options are incorrect:**

* **Crude rates:** These are calculated by dividing total deaths by the total population. They do not account for age distribution, making them unsuitable for comparing populations with different demographics.
* **Proportional rates:** These measure the proportion of total deaths attributed to a specific cause (e.g., % of deaths due to CVD). They do not reflect the actual risk of dying in a population and are influenced by changes in other causes of death.

**High-Yield NEET-PG Pearls:**

* **Standardization Methods:**
  * **Direct Standardization:** Used when age-specific death rates of the study population are known.
  * **Indirect Standardization:** Used when age-specific rates are unknown or the population is small. It calculates the **Standardized Mortality Ratio (SMR)**.
* **SMR Formula:** (Observed Deaths / Expected Deaths) × 100.
* **Gold Standard:** Age-adjustment is the "Gold Standard" for comparing disease frequency or mortality across different geographic areas or time periods.
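
The effect of standardization can be sketched in a few lines of Python. All figures below are hypothetical and chosen only for illustration: two populations share identical age-specific death rates but have different age structures, so their crude rates differ while their directly standardized rates agree. The function names and the standard population are assumptions, not taken from any official source.

```python
def crude_rate(age_specific_rates, population):
    """Total deaths divided by total population (ignores age structure)."""
    deaths = sum(r * p for r, p in zip(age_specific_rates, population))
    return deaths / sum(population)


def direct_standardized_rate(age_specific_rates, standard_population):
    """Apply the study population's age-specific rates to a standard population."""
    expected = sum(r * p for r, p in zip(age_specific_rates, standard_population))
    return expected / sum(standard_population)


def standardized_mortality_ratio(observed_deaths, reference_rates, study_population):
    """Indirect standardization: SMR = (observed / expected deaths) x 100."""
    expected = sum(r * p for r, p in zip(reference_rates, study_population))
    return observed_deaths / expected * 100


# Hypothetical age bands: young, middle-aged, elderly.
rates = [0.001, 0.005, 0.050]        # same age-specific risk in both populations
young_country = [7000, 2500, 500]    # mostly young people
old_country = [3000, 4000, 3000]     # many elderly people
standard = [5000, 3500, 1500]        # hypothetical standard population

print("crude:", crude_rate(rates, young_country), crude_rate(rates, old_country))
print("adjusted:", direct_standardized_rate(rates, standard))
print("SMR:", standardized_mortality_ratio(200, rates, old_country))
```

The crude rates differ almost fourfold purely because of age structure, while the adjusted rate is the same for both populations. An SMR above 100 means more deaths were observed than the reference rates would predict.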

Q: 95% of the values in a distribution correspond to which of the following number of standard deviations from the mean?

2 standard deviations.

### Explanation

This question tests the fundamental concept of the **Normal Distribution (Gaussian Curve)**, which is a cornerstone of biostatistics in medical research. In a perfectly symmetrical, bell-shaped curve, the distribution of data points follows the **Empirical Rule** (also known as the 68-95-99.7 rule).

**Why Option B is Correct:** In a normal distribution, the area under the curve represents the probability or percentage of data points.

* **Mean ± 1.96 Standard Deviations (SD)** encompasses **95%** of the values.
* In most competitive exams like NEET-PG, **1.96 is rounded to 2 SD** for simplicity. Therefore, about 95% of the population falls within 2 SD of the mean.

**Analysis of Incorrect Options:**

* **Option A (1 SD):** Approximately **68.3%** of the values lie within Mean ± 1 SD. This represents the "central" majority of the data.
* **Option C (3 SD):** Approximately **99.7%** of the values lie within Mean ± 3 SD. This covers almost the entire distribution, leaving only 0.3% as extreme outliers.
* **Option D (4 SD):** This covers **99.99%** of the data. In medical statistics, we rarely use 4 SD as 3 SD already accounts for nearly all biological variation.

**High-Yield Clinical Pearls for NEET-PG:**

* **Confidence Interval (CI):** The 95% CI is the most commonly used range in medical literature to determine statistical significance.
* **Normal Distribution Characteristics:** Mean = Median = Mode. The curve is asymptotic (tails never touch the baseline).
* **Z-score:** This indicates how many standard deviations a value is from the mean. A Z-score of 1.96 corresponds to the 95% confidence limit.
* **Standard Error vs. Standard Deviation:** SD measures the dispersion of individual values; Standard Error (SE) measures the precision of the sample mean compared to the population mean.
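
The percentages quoted above can be checked numerically. The sketch below assumes an ideal normal distribution and uses the identity that the area within mean ± k SD equals erf(k / √2); the helper name `coverage_within` is made up for this example.

```python
import math


def coverage_within(k):
    """Fraction of a normal distribution lying within mean +/- k standard deviations."""
    return math.erf(k / math.sqrt(2))


for k in (1, 1.96, 2, 3, 4):
    print(f"mean +/- {k} SD covers {coverage_within(k) * 100:.2f}% of values")
```

Running it shows why 1.96 is the precise multiplier for 95% and why rounding to 2 SD slightly overshoots (about 95.4%).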

Q: What is the correlation coefficient that best depicts the relationship between age and height in a toddler?

Correlation coefficient = +1.

### Explanation

**1. Why Option A is Correct:** The correlation coefficient ($r$) measures the strength and direction of a linear relationship between two variables. In a toddler, growth is a physiological certainty; as age increases, height increases in a predictable, approximately linear fashion. A **Correlation coefficient of +1** represents a **Perfect Positive Correlation**. This means that for every unit increase in age, there is a proportional and consistent increase in height. In biological growth phases, these two variables are so closely linked that they approach the strongest possible positive relationship. Real measurements contain individual variation, so an observed $r$ would be close to, but below, +1; among the options offered, +1 is the best description.

**2. Why Other Options are Incorrect:**

* **Option B (–1):** This represents a **Perfect Negative Correlation**. This would imply that as a toddler gets older, their height decreases (e.g., the older they get, the shorter they become), which is physiologically impossible.
* **Option C (+2):** This is mathematically impossible. The value of the correlation coefficient ($r$) **must always range between –1 and +1**. Any value outside this range (e.g., +2 or –1.5) is invalid.
* **Option D:** Incorrect because Option A accurately describes the biological relationship.

**3. Clinical Pearls & High-Yield Facts for NEET-PG:**

* **Range of $r$:** Always –1 to +1.
* **$r = 0$:** Indicates **Zero Correlation** (no linear relationship), such as the relationship between shoe size and intelligence.
* **Direction:** Positive (+) means variables move in the same direction; Negative (–) means they move in opposite directions (e.g., Price vs. Demand).
* **Strength:** The closer the absolute value is to 1 (regardless of the sign), the stronger the relationship.
* **Coefficient of Determination ($r^2$):** This represents the proportion of variance in one variable that is predictable from the other. If $r = 0.8$, then $r^2 = 0.64$ (64% of the variance is explained).
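
As a rough illustration, Pearson's $r$ can be computed by hand. The age and height values below are invented for the example: the first height series is exactly linear in age and so gives $r = +1$, while the second adds small made-up deviations and gives a value just under +1.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toddler data (not real growth-chart values).
age_months = [12, 18, 24, 30, 36]
height_linear = [75.0, 81.0, 87.0, 93.0, 99.0]   # exactly 1 cm per month
height_noisy = [75.0, 82.0, 86.5, 93.5, 98.0]    # same trend with small deviations

r_perfect = pearson_r(age_months, height_linear)
r_noisy = pearson_r(age_months, height_noisy)
print(r_perfect, r_noisy, r_noisy ** 2)          # r and coefficient of determination
```

Reversing one of the series would flip the sign of $r$, and no input can push the result outside the range –1 to +1.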

Q: Calculate the positive predictive value of an ELISA test for HIV, given a sensitivity of 99%, specificity of 99%, and a prevalence of HIV in the population of 5 per 1000.

33.

### Explanation

**1. Understanding the Calculation (Why C is Correct)** Positive Predictive Value (PPV) is the probability that a person who tests positive actually has the disease. It is heavily influenced by the **prevalence** of the disease in the population. To calculate PPV, we can use a hypothetical population of 10,000:

* **Prevalence:** 5 per 1,000 = 50 cases in 10,000.
* **True Positives (TP):** Sensitivity (99%) of 50 = **49.5**
* **False Positives (FP):** 10,000 - 50 = 9,950 healthy people. Specificity is 99%, so the False Positive Rate is 1%. 1% of 9,950 = **99.5**
* **PPV Formula:** $TP / (TP + FP) \times 100$
* $49.5 / (49.5 + 99.5) \times 100 = 49.5 / 149 \times 100 \approx \mathbf{33.2\%}$

**2. Analysis of Incorrect Options**

* **Option A (10):** This value is too low. While low prevalence reduces PPV, a test with 99% sensitivity/specificity still maintains a moderate PPV at 0.5% prevalence.
* **Option B (70):** This would be the PPV if the prevalence were significantly higher (approx. 2-3%).
* **Option D (All):** PPV is a specific mathematical derivative based on fixed parameters; it cannot be multiple values simultaneously.

**3. High-Yield Clinical Pearls for NEET-PG**

* **Prevalence Dependency:** PPV **rises** with prevalence: as prevalence increases, PPV increases. Conversely, Negative Predictive Value (NPV) **falls** as prevalence increases. Neither relationship is strictly linear.
* **Screening vs. Diagnostic:** In low-prevalence populations (like general screening), even a highly specific test will yield many false positives.
* **Sensitivity/Specificity:** These are inherent properties of the test and do **not** change with disease prevalence, unlike PPV and NPV.
* **Formula Shortcut (Bayes' Theorem):** $PPV = \frac{\text{Sensitivity} \times \text{Prevalence}}{(\text{Sensitivity} \times \text{Prevalence}) + (1 - \text{Specificity}) \times (1 - \text{Prevalence})}$
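
The same arithmetic can be written as a small Python function, which also makes the dependence on prevalence easy to explore. This is a minimal sketch based on the Bayes' theorem form of the formula; the function name `predictive_values` is an illustrative choice.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) as fractions, given test properties and disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Values from the question: sensitivity 99%, specificity 99%, prevalence 5 per 1000.
ppv, npv = predictive_values(0.99, 0.99, 5 / 1000)
print(f"PPV = {ppv * 100:.1f}%, NPV = {npv * 100:.3f}%")

# PPV climbs quickly as prevalence rises, with the test itself unchanged.
for prevalence in (0.005, 0.02, 0.03, 0.10):
    print(prevalence, round(predictive_values(0.99, 0.99, prevalence)[0] * 100, 1))
```

Sensitivity and specificity stay fixed in every call; only the prevalence argument changes, which is exactly why PPV and NPV are population-dependent.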

Q: What is the statistical analysis of data from various studies on the same matter called?

Meta-analysis.

### Explanation

**Correct Answer: A. Meta-analysis**

**Why it is correct:** A **Meta-analysis** is a quantitative, formal, epidemiological study design used to systematically assess the results of previous research to derive conclusions about that body of research. It involves the statistical integration of data from multiple independent studies (usually Randomized Controlled Trials) on the same subject to increase the statistical power and provide a single, more precise estimate of effect (often visualized using a **Forest Plot**). It sits at the very top of the hierarchy of evidence-based medicine.

**Why the other options are incorrect:**

* **B. Data review:** This is a generic term for examining data. While a "Systematic Review" is a structured qualitative summary of literature, "Data review" lacks the specific statistical synthesis required by the question.
* **C. Propaganda:** This refers to biased or misleading information used to promote a particular political cause or point of view; it has no scientific standing in biostatistics.
* **D. Cohort study:** This is an observational, longitudinal study where a group of people (exposed and non-exposed) are followed forward in time to determine the incidence of an outcome. It analyzes primary data from one study, not aggregate data from multiple studies.

**High-Yield Clinical Pearls for NEET-PG:**

* **Forest Plot (Blobbogram):** The graphical representation used in meta-analysis. The diamond at the bottom represents the combined "pooled" result.
* **Heterogeneity:** Measured by the **I² statistic**; it tells us how much the results of the included studies vary from each other.
* **Publication Bias:** Often assessed using a **Funnel Plot**. If the plot is asymmetrical, publication bias is likely present.
* **Hierarchy of Evidence:** Meta-analysis of RCTs > Systematic Reviews > RCTs > Cohort > Case-Control > Case Series > Expert Opinion.
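
To make "pooled estimate" and "I²" concrete, here is a minimal sketch of inverse-variance fixed-effect pooling, one common approach among several (the text above does not specify a method). The study effects and standard errors are invented for the example, and `pool_fixed_effect` is a hypothetical helper name.

```python
def pool_fixed_effect(effects, standard_errors):
    """Inverse-variance fixed-effect pooling.

    Returns (pooled effect, pooled standard error, I-squared in percent).
    """
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    pooled_se = (1 / sum(weights)) ** 0.5
    # Cochran's Q measures how far the studies scatter around the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, pooled_se, i_squared


# Hypothetical log risk ratios and standard errors from three trials.
log_risk_ratios = [-0.30, -0.10, -0.25]
standard_errors = [0.15, 0.20, 0.10]

pooled, pooled_se, i_squared = pool_fixed_effect(log_risk_ratios, standard_errors)
print(pooled, pooled_se, i_squared)
```

The pooled standard error is smaller than that of any single study, which is the gain in precision that motivates meta-analysis. When studies disagree strongly, I² approaches 100% and a random-effects model is usually considered instead.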

Q: If the birth weight of each of the 10 babies born in a hospital in a day is found to be 2.8 kgs, what will be the standard deviation of this sample?

0.

### Explanation

**1. Why the Correct Answer is Right:** Standard Deviation (SD) is a measure of **dispersion** or **variability** in a data set. It quantifies how much the individual values in a sample deviate from the arithmetic mean. In this scenario, every single baby has the exact same birth weight (2.8 kgs).

* **Step 1:** Calculate the Mean ($\bar{x}$) = $(2.8 \times 10) / 10 = 2.8$ kgs.
* **Step 2:** Calculate the deviation of each value from the mean ($x - \bar{x}$). Since every value is 2.8, the deviation for every baby is $2.8 - 2.8 = 0$.
* **Step 3:** Since there is **zero variation** in the data, the Standard Deviation must be **0**.

**2. Why the Incorrect Options are Wrong:**

* **Option A (2.8 kgs):** This is the mean and the individual value, not the measure of dispersion.
* **Option C (1):** A standard deviation of 1 would imply that the weights vary around the mean (e.g., some babies weighing 1.8 kg or 3.8 kg).
* **Option D (0.28 kgs):** This is likely a distractor representing 10% of the mean, but it has no mathematical basis in this constant data set.

**3. Clinical Pearls & High-Yield Facts for NEET-PG:**

* **Definition:** SD is the most commonly used measure of dispersion in medical research. It is the square root of the **Variance**.
* **Properties:** If a constant value is added or subtracted from every observation in a dataset, the SD remains **unchanged**. However, if there is no variation (all values are identical), the SD is always zero.
* **Normal Distribution:** In a normal (Gaussian) distribution:
  * Mean ± 1 SD covers **68.3%** of values.
  * Mean ± 2 SD covers **95.4%** of values.
  * Mean ± 3 SD covers **99.7%** of values.
* **Standard Error (SE):** Do not confuse SD with SE. $SE = SD / \sqrt{n}$. SE measures the variation of sample means, while SD measures variation within a single sample.
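
These properties are easy to confirm with Python's standard `statistics` module. The identical weights come from the question; the `varied` sample and the `standard_error` helper are hypothetical additions used to show that shifting every value leaves the SD unchanged and that $SE = SD / \sqrt{n}$.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


identical = [2.8] * 10                     # the ten babies from the question
varied = [2.5, 2.8, 3.1, 3.4]              # hypothetical sample with real spread
shifted = [w + 1.0 for w in varied]        # add a constant to every observation

print(statistics.stdev(identical))                          # no variation at all
print(statistics.stdev(varied), statistics.stdev(shifted))  # same spread
print(standard_error(varied))
```

`statistics.stdev` uses the sample formula (dividing by n - 1); for identical values the population formula `statistics.pstdev` gives zero as well.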

Q: A centile divides data into how many equal parts?

100 equal parts.

**Explanation:** In biostatistics, **Centiles** (also known as **Percentiles**) are measures of position that divide a frequency distribution into **100 equal parts**. Each part represents 1% of the total data set. For example, the 50th percentile is the Median, which divides the data into two halves.

**Analysis of Options:**

* **A. 100 equal parts (Correct):** The term "Centile" is derived from the Latin *centum* (hundred). It indicates the value below which a certain percentage of observations fall.
* **B. 10 equal parts (Incorrect):** These are called **Deciles**. The 1st decile is the 10th percentile, and the 5th decile is the Median.
* **C. 5 equal parts (Incorrect):** These are called **Quintiles**. Each quintile represents 20% of the data.
* **D. 20 equal parts (Incorrect):** These are called **Vigintiles**. Each part represents 5% of the data.

**Clinical Pearls & High-Yield Facts for NEET-PG:**

* **Quartiles:** Divide data into **4 equal parts** (Q1=25th, Q2=50th/Median, Q3=75th percentile).
* **Interquartile Range (IQR):** Calculated as $Q3 - Q1$. It contains the middle 50% of the observations and is the preferred measure of dispersion for skewed data.
* **Growth Charts:** In Pediatrics, centiles are used to monitor growth (e.g., a child on the 95th percentile for weight is heavier than 95% of children of the same age/sex).
* **Median:** It is the only measure of central tendency that corresponds to the 50th percentile, 5th decile, and 2nd quartile.
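
A short Python sketch shows how centiles, deciles, and quartiles relate. It uses `statistics.quantiles` (Python 3.8+) on a made-up data set of the integers 1 to 99; note that dividing data into n equal parts needs n - 1 cut points, and that different software may interpolate quantiles slightly differently.

```python
import statistics

data = list(range(1, 100))                       # hypothetical observations: 1..99

centiles = statistics.quantiles(data, n=100)     # 99 cut points -> 100 parts
deciles = statistics.quantiles(data, n=10)       # 9 cut points  -> 10 parts
quartiles = statistics.quantiles(data, n=4)      # 3 cut points  -> 4 parts

median = statistics.median(data)
iqr = quartiles[2] - quartiles[0]                # Q3 - Q1, the middle 50%

print(len(centiles), len(deciles), len(quartiles))
print(centiles[49], deciles[4], quartiles[1], median)   # all describe the median
print(iqr)
```

The 50th centile, 5th decile, and 2nd quartile all land on the median, matching the pearl above.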

Question 1

What is the scale of measurement used for classifying a person as 'hypertensive', 'normotensive', or 'hypotensive'?

Accepted Answer

Ordinal scale

Answer

Interval scale

Answer

Nominal scale

Answer

Ratio scale

Question 2

What is the best method for comparing mortality rates between two populations with different age structures?

Accepted Answer

Age-adjusted rates

Answer

Proportional rates

Answer

None of the above

Answer

Crude rates

Question 3

95% of the values in a distribution correspond to which of the following number of standard deviations from the mean?

Accepted Answer

2 standard deviations

Answer

1 standard deviation

Answer

3 standard deviations

Answer

4 standard deviations

Question 4

What is the correlation coefficient that best depicts the relationship between age and height in a toddler?

Accepted Answer

Correlation coefficient = +1

Answer

Correlation coefficient = -1

Answer

Correlation coefficient = +2

Answer

None of the above

Question 5

Calculate the positive predictive value of an ELISA test for HIV, given a sensitivity of 99%, specificity of 99%, and a prevalence of HIV in the population of 5 per 1000.

Accepted Answer

33

Answer

10

Answer

70

Answer

All

Question 6

What is the P-value?

Accepted Answer

The probability of rejecting a null hypothesis when it is true.

Answer

The probability of not rejecting a null hypothesis when it is true.

Answer

The probability of not rejecting a null hypothesis when it is false.

Answer

The probability of rejecting a null hypothesis when it is false.

Question 7

What is the statistical analysis of data from various studies on the same matter called?

Accepted Answer

Meta-analysis

Answer

Data review

Answer

Propaganda

Answer

Cohort study

Question 8

If the birth weight of each of the 10 babies born in a hospital in a day is found to be 2.8 kgs, what will be the standard deviation of this sample?

Accepted Answer

0

Answer

2.8 kgs

Answer

1

Answer

0.28 kgs

Question 9

A centile divides data into how many equal parts?

Accepted Answer

100 equal parts

Answer

10 equal parts

Answer

5 equal parts

Answer

20 equal parts

Question 10

Which of the following is true regarding statistical significance testing?

Accepted Answer

The probability associated with a Type 1 error is denoted by alpha (α).

Answer

Type 1 error is rejecting the null hypothesis when it is true.

Answer

Type 2 error is accepting the null hypothesis when it is false.

Answer

The significance level (alpha) is typically set to 5%, but can be varied.

Biostatistics — MCQs
