Biostatistics Practice Questions

Q: In a normal distribution, what percent of the values will be included in the area between two standard deviations on either side of the mean (x ± 2 SD)?

95.4%.

### Explanation

This question tests your knowledge of the **Normal Distribution (Gaussian Curve)**, a fundamental concept in biostatistics used to describe continuous variables in a population (e.g., height, blood pressure, or hemoglobin levels).

#### Why C is Correct

In a perfectly symmetrical, bell-shaped normal distribution, the area under the curve represents the probability or percentage of the population. The **Empirical Rule** (also known as the 68-95-99.7 rule) defines the areas covered by standard deviations (SD) on either side of the mean:

* **Mean ± 1 SD:** Covers **68.27%** of the values.
* **Mean ± 2 SD:** Covers **95.45%** (quoted as 95.4%) of the values.
* **Mean ± 3 SD:** Covers **99.73%** of the values.

#### Why Other Options are Incorrect

* **Option A (68.3%):** This represents the area within **one** standard deviation (Mean ± 1 SD).
* **Option B (90.4%):** This is a distractor; it does not correspond to a standard integer SD interval in a normal distribution.
* **Option D (99.7%):** This represents the area within **three** standard deviations (Mean ± 3 SD), covering almost the entire population.

#### High-Yield Clinical Pearls for NEET-PG

1. **Standard Normal Distribution:** A specific case where the **Mean is 0** and the **Standard Deviation is 1**.
2. **Z-Score:** Indicates how many standard deviations a value is from the mean. For example, a Z-score of +2 corresponds to the 97.7th percentile.
3. **Confidence Intervals (CI):** For a 95% CI (commonly used in research), the multiplier is **1.96**, not exactly 2, and it is applied to the standard error of the estimate. However, in general biostatistics questions, Mean ± 2 SD is equated to 95.4% of the values.
4. **Properties:** In a normal distribution, **Mean = Median = Mode**. The curve is asymptotic (never touches the base axis).
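The percentages in the Empirical Rule can be checked numerically. The following is a minimal sketch, assuming only the standard identity P(|Z| < k) = erf(k / √2) for a standard normal variable; the function name `area_within` is made up for illustration.

```python
import math


def area_within(k: float) -> float:
    """Fraction of a normal distribution lying within k SD of the mean."""
    # For a standard normal Z, P(|Z| < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


# Integer SD intervals from the Empirical Rule, plus the 1.96 multiplier
for k in (1, 2, 3, 1.96):
    print(f"mean ± {k} SD covers {area_within(k) * 100:.2f}% of values")
```

Running it should reproduce 68.27%, 95.45% and 99.73% for 1, 2 and 3 SD, and show that 1.96 SD is the multiplier that gives almost exactly 95%.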

Q: Standard deviation is a measure of:

Dispersion from the mean value.

**Explanation:**

**Standard Deviation (SD)** is the most commonly used measure of **dispersion** (variation) in biostatistics. It quantifies how much individual observations in a data set spread out or "deviate" from the arithmetic mean. A small SD indicates that the data points are clustered closely around the mean, while a large SD suggests a wide range of variation. In a Normal Distribution, SD helps define the "Normal Limits" (e.g., Mean ± 2 SD covers approximately 95% of the values).

**Analysis of Options:**

* **Option A (Chance):** This is incorrect. The role of chance is usually assessed with the **P-value**, which is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis were true.
* **Option B (Central Tendency):** This is incorrect. Measures of central tendency describe the "center" or typical value of a distribution. These include the **Mean, Median, and Mode**.
* **Option C (Dispersion):** This is correct. Other measures of dispersion include Range, Mean Deviation, and Variance (which is SD squared).

**High-Yield Clinical Pearls for NEET-PG:**

* **Standard Error (SE):** Do not confuse SD with SE. While SD measures the variation of individual observations within a single sample, SE measures how much the *sample mean* is expected to vary from the true *population mean* (SE = SD ÷ √n).
* **Coefficient of Variation (CV):** This is (SD ÷ Mean) × 100. It is used to compare the relative dispersion of two sets of data with different units (e.g., comparing height in cm vs. weight in kg).
* **Normal Distribution Rule:**
  * Mean ± 1 SD = 68.3% of values
  * Mean ± 2 SD = 95.4% of values
  * Mean ± 3 SD = 99.7% of values
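The relationship between SD, variance, SE and CV can be illustrated with a few lines of Python. This is only a sketch: the haemoglobin readings are invented for the example, and the sample (n - 1) form of the SD is assumed.

```python
import math
import statistics

# Hypothetical haemoglobin readings in g/dl (invented for illustration)
values = [12.0, 13.0, 13.5, 14.0, 15.0]

mean = statistics.mean(values)
sd = statistics.stdev(values)           # sample SD, n - 1 in the denominator
variance = statistics.variance(values)  # equals SD squared
se = sd / math.sqrt(len(values))        # standard error of the mean
cv = sd / mean * 100                    # coefficient of variation, in percent

print(f"mean={mean:.2f}  SD={sd:.3f}  variance={variance:.3f}")
print(f"SE={se:.3f}  CV={cv:.1f}%")
```

SD and SE are in the units of the data, whereas CV is a unit-free percentage, which is why it can be compared across variables measured in different units.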

Q: A study reports patient satisfaction levels as 'Satisfied', 'Very satisfied', and 'Dissatisfied'. Which type of scale best represents these categories?

Ordinal.

### Explanation

**Why Ordinal is Correct:**

The data presented (Satisfied, Very satisfied, Dissatisfied) represents **Ordinal data**. In biostatistics, an ordinal scale is used when data can be categorized into distinct groups that have a **natural rank or inherent order**, but the mathematical distance between the categories is not uniform or measurable. In this study, "Very satisfied" is clearly higher than "Satisfied," which is higher than "Dissatisfied," but you cannot quantify exactly *how much* more satisfied one patient is compared to another.

**Why Other Options are Incorrect:**

* **Nominal:** This scale is for naming or labeling categories without any inherent order (e.g., blood groups A, B, O; gender; eye color). Since satisfaction levels have a logical hierarchy, they are not merely nominal.
* **Interval:** This scale has a defined order and equal intervals between values, but **no true zero point** (e.g., Temperature in Celsius). Satisfaction levels do not have measurable, equal mathematical intervals.
* **Ratio:** This is the highest level of measurement. It has all the properties of an interval scale plus a **true zero point** (e.g., Height, Weight, Blood Pressure). Satisfaction levels cannot be zero in a mathematical sense.

**Clinical Pearls & High-Yield Facts for NEET-PG:**

* **Mnemonic for Scales (Lowest to Highest Complexity):** **NOIR** (**N**ominal → **O**rdinal → **I**nterval → **R**atio).
* **Likert Scales:** Most surveys using "Strongly Agree" to "Strongly Disagree" are classic examples of **Ordinal** data.
* **Cancer Staging:** TNM staging or WHO functional grades are **Ordinal** scales.
* **Qualitative vs. Quantitative:** Nominal and Ordinal are **Qualitative (Categorical)**, while Interval and Ratio are **Quantitative (Numerical)**.
* **Central Tendency:** For Ordinal data, the **Median** is the most appropriate measure of central tendency.
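As a rough illustration of why the median suits ordinal data, the sketch below ranks the three satisfaction categories and picks the middle response. The list of responses is hypothetical, and the rank numbers carry order only, not distance.

```python
import statistics

# Categories listed from lowest to highest; only the order is meaningful
LEVELS = ["Dissatisfied", "Satisfied", "Very satisfied"]
RANK = {label: position for position, label in enumerate(LEVELS)}

# Hypothetical survey responses (invented for illustration)
responses = [
    "Satisfied", "Very satisfied", "Dissatisfied",
    "Satisfied", "Very satisfied",
]

# median_low always returns an actual data point, so it maps back to a label
ranks = [RANK[r] for r in responses]
median_label = LEVELS[statistics.median_low(ranks)]
print("Median response:", median_label)
```

A mean of the rank numbers would assume equal spacing between categories, which an ordinal scale does not guarantee.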

Q: Amount of alcohol consumption amongst a group of alcoholics, before and after intervention, was recorded. What is the most appropriate statistical test to assess 'significant change' resulting from the intervention program?

Paired t-test.

### Explanation

The core of this question lies in identifying the type of data and the relationship between the study groups.

**1. Why Paired t-test is correct:**

* **Type of Data:** "Amount of alcohol consumption" is a **quantitative (numerical/continuous)** variable (e.g., ml/day).
* **Study Design:** The measurements are taken from the **same group** of individuals at two different points in time (**Before and After** intervention). These are "dependent" or "paired" observations.
* **Purpose:** To compare the means of two related groups and determine whether the intervention caused a statistically significant change, the **Paired t-test** is the standard parametric test.

**2. Why other options are incorrect:**

* **Unpaired (Independent) t-test:** This is used to compare the means of two **independent** groups (e.g., comparing alcohol intake between Group A and Group B).
* **Chi-square test:** This is used for **qualitative (categorical)** data to compare proportions (e.g., comparing the number of "drinkers" vs. "non-drinkers" in two groups). It cannot analyze the "amount" of consumption directly.
* **McNemar test:** This is used for **paired qualitative** data. It would be appropriate if we were looking at a "Yes/No" change in addiction status before and after treatment, rather than the specific amount consumed.

### High-Yield Clinical Pearls for NEET-PG:

* **Quantitative Data + 2 Groups:**
  * Paired (Before/After) $\rightarrow$ **Paired t-test**
  * Unpaired (Group A vs B) $\rightarrow$ **Unpaired t-test**
* **Quantitative Data + >2 Groups:** Use **ANOVA**.
* **Qualitative Data + 2 Groups:**
  * Unpaired $\rightarrow$ **Chi-square test**
  * Paired $\rightarrow$ **McNemar test**
* **Non-parametric alternative:** If the data are not normally distributed, the non-parametric alternative to the Paired t-test is the **Wilcoxon Signed-Rank test**.
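The paired t-test works on the within-person differences rather than on the two sets of readings separately. The sketch below computes the test statistic by hand using only the standard library; the before/after values and the function name `paired_t` are invented for illustration.

```python
import math
import statistics

# Hypothetical alcohol intake in ml/day for the same six people
before = [120, 150, 100, 130, 140, 110]
after = [100, 140, 90, 120, 115, 105]


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired observations."""
    # Each person acts as their own control, so work with the differences
    diffs = [x - y for x, y in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)    # sample SD of the differences
    se_diff = sd_diff / math.sqrt(n)     # standard error of the mean difference
    return mean_diff / se_diff, n - 1


t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The resulting t value is then compared with the critical value from a t-table for n - 1 degrees of freedom. In practice a library routine such as `scipy.stats.ttest_rel` would normally be used, since it also returns the P-value.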

Q: The Hb level in healthy women has a mean of 13.5 g/dl and a standard deviation of 1.5 g/dl. What is the Z-score for a woman with an Hb level of 15.0 g/dl?

1.0.

### Explanation

**Concept:** The **Z-score** (Standard Score) is a fundamental biostatistical tool used to determine how many standard deviations a specific observation is from the mean. It allows us to compare individual data points within a normal distribution.

The formula for calculating the Z-score is:

$$Z = \frac{X - \mu}{\sigma}$$

*Where:*

* **X** = Individual value (15.0 g/dl)
* **μ (Mean)** = Average value (13.5 g/dl)
* **σ (Standard Deviation)** = 1.5 g/dl

**Calculation:**

$Z = \frac{15.0 - 13.5}{1.5} = \frac{1.5}{1.5} = \mathbf{1.0}$

A Z-score of +1.0 indicates that the woman's Hb level is exactly **one standard deviation above the mean**.

---

### Analysis of Options:

* **Option D (1.0) is Correct:** As calculated above, the difference between the value and the mean equals exactly one unit of standard deviation.
* **Option C (2.0) is Incorrect:** This would require an Hb level of 16.5 g/dl ($13.5 + [2 \times 1.5]$).
* **Options A (9.0) and B (10.0) are Incorrect:** These represent extreme outliers. In a normal distribution, 99.7% of all values fall within a Z-score of ±3. A Z-score of 10 would be physiologically improbable in this context.

---

### High-Yield Clinical Pearls for NEET-PG:

1. **Normal Distribution (Gaussian Curve):**
   * Mean ± 1 SD covers **68.3%** of values.
   * Mean ± 2 SD covers **95.4%** of values.
   * Mean ± 3 SD covers **99.7%** of values.
2. **Z-score of 0:** This means the individual's value is exactly equal to the mean.
3. **Standard Normal Distribution:** A specific normal distribution where the **Mean is 0** and the **Standard Deviation is 1**.
4. **Application:** Z-scores are clinically used in **WHO Growth Charts** (e.g., Weight-for-height Z-scores) to diagnose malnutrition (Wasting/Stunting).
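The calculation above can be written as a small helper. This is a minimal sketch: the function names are made up for illustration, and the percentile conversion assumes the values are normally distributed.

```python
import math


def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def percentile_from_z(z: float) -> float:
    """Percentage of a normal population lying below the given Z-score."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2))) * 100


z = z_score(15.0, mean=13.5, sd=1.5)
print(f"Z = {z}")  # → Z = 1.0
print(f"Percentile: {percentile_from_z(z):.1f}")
```

With the same mean and SD, an Hb level of 16.5 g/dl gives a Z-score of 2.0, which matches the reasoning used to exclude Option C.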

Q: What is the formula for calculating probability from odds?

Odds / (1 + Odds).

### Explanation

In biostatistics, **Probability** and **Odds** are two ways of expressing the likelihood of an event, but they use different denominators.

1. **Probability (P):** The ratio of the number of times an event occurs to the *total* number of trials. It ranges from 0 to 1.
   * $P = \frac{\text{Events}}{\text{Events} + \text{Non-events}}$
2. **Odds:** The ratio of the number of times an event occurs to the number of times it *does not* occur.
   * $\text{Odds} = \frac{P}{1 - P}$

**Why Option B is Correct:**

To derive Probability from Odds, we rearrange the formula:

$\text{Probability} = \frac{\text{Odds}}{1 + \text{Odds}}$

For example, if the odds of a disease are 1:4 (0.25), the probability is $0.25 / (1 + 0.25) = 0.20$ or 20%.

**Analysis of Incorrect Options:**

* **Option C [(1 + Odds) / Odds]:** This is the reciprocal of the correct formula and has no standard application in biostatistics.
* **Option D [(1 - Odds) / Odds]:** This is a mathematical distortion; it does not represent any standard epidemiological measure.

**Clinical Pearls for NEET-PG:**

* **Case-Control Studies:** Use **Odds Ratio (OR)** because the total population at risk (denominator for probability) is unknown.
* **Cohort Studies:** Use **Relative Risk (RR)**, which is based on probability/incidence.
* **Rare Disease Assumption:** When a disease is rare (prevalence <10%), the Odds Ratio becomes a good approximation of the Relative Risk.
* **Range:** Probability is always between 0 and 1 (or 0–100%), whereas Odds can range from 0 to infinity.
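The two conversions are inverses of each other, which the following sketch illustrates using the 1:4 odds from the worked example. The function names are made up for illustration.

```python
def odds_to_probability(odds: float) -> float:
    """Convert odds (events : non-events, as a single number) to a probability."""
    return odds / (1 + odds)


def probability_to_odds(p: float) -> float:
    """Convert a probability (0 <= p < 1) to odds."""
    return p / (1 - p)


odds = 1 / 4  # odds of 1:4
p = odds_to_probability(odds)
print(f"odds {odds} -> probability {p:.2f}")
print(f"probability {p:.2f} -> odds {probability_to_odds(p):.2f}")
```

Note that odds of 1 (1:1) correspond to a probability of 0.5, and that for small probabilities the odds and the probability are nearly equal, which underlies the rare disease assumption.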

Q: What is the formula for calculating positive predictive value (PPV)?

a / (a + b).

### Explanation

To understand predictive values, we must first construct the standard **2x2 Contingency Table** [1]:

| | Disease Present (+) | Disease Absent (-) | Total |
| :--- | :---: | :---: | :---: |
| **Test Positive (+)** | **a** (True Positive) | **b** (False Positive) | **a + b** |
| **Test Negative (-)** | **c** (False Negative) | **d** (True Negative) | **c + d** |
| **Total** | **a + c** | **b + d** | **a + b + c + d** |

#### Why Option A is Correct

**Positive Predictive Value (PPV)** measures the probability that a patient actually has the disease given that the test result is positive [2]. It is calculated by dividing the number of True Positives (**a**) by the total number of people who tested positive (**a + b**).

* **Formula:** $PPV = \frac{a}{a + b}$

#### Analysis of Incorrect Options

* **Option B [d / (c + d)]:** This is the formula for **Negative Predictive Value (NPV)** [1]. It represents the probability that a patient is truly healthy given a negative test result.
* **Option C [a / (a + c)]:** This is the formula for **Sensitivity**. It measures the ability of a test to correctly identify those with the disease (True Positive Rate).
* **Option D [d / (b + d)]:** This is the formula for **Specificity**. It measures the ability of a test to correctly identify those without the disease (True Negative Rate).

#### NEET-PG High-Yield Pearls

1. **Prevalence Dependency:** Unlike Sensitivity and Specificity (which are inherent properties of the test), **Predictive Values depend on the prevalence** of the disease in the population [2].
2. **The Relationship:** [2]
   * If Prevalence **increases** $\rightarrow$ PPV **increases** and NPV **decreases**.
   * If Prevalence **decreases** $\rightarrow$ PPV **decreases** and NPV **increases**.
3. **Clinical Utility:** PPV is the most useful measure for a clinician when communicating a diagnosis to a patient after receiving a positive lab report.
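All four measures come from the same 2x2 table, and the dependence of PPV on prevalence follows from Bayes' theorem. The sketch below shows both; the cell counts and function names are invented for illustration.

```python
def diagnostic_measures(a: int, b: int, c: int, d: int) -> dict:
    """a = true positives, b = false positives, c = false negatives, d = true negatives."""
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
    }


def ppv_from_prevalence(sensitivity: float, specificity: float, prevalence: float) -> float:
    """PPV via Bayes' theorem: true positives over all positives."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical table: 400 people, 100 of whom have the disease
m = diagnostic_measures(a=90, b=30, c=10, d=270)
print({name: round(value, 3) for name, value in m.items()})

# Same test applied at different prevalences
for prevalence in (0.01, 0.25, 0.50):
    ppv = ppv_from_prevalence(m["sensitivity"], m["specificity"], prevalence)
    print(f"prevalence {prevalence:.2f} -> PPV {ppv:.3f}")
```

With sensitivity and specificity held fixed, the PPV rises as prevalence rises, which is the relationship stated in the pearls above.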

Question 1

What is a frequency curve?

Accepted Answer

When the number of observations is large and the group interval is reduced, a frequency polygon loses its angulations and becomes a curve.

Answer

It is a frequency polygon presenting variation by a line, showing the trend of an event over a period of time.

Answer

It is a graph of cumulative frequency distribution.

Answer

It is an area diagram of frequency distribution developed over a histogram.

Question 2

In a normal distribution, what percent of the values will be included in the area between two standard deviations on either side of the mean (x ± 2 SD)?

Accepted Answer

95.4%

Answer

68.3%

Answer

90.4%

Answer

99.7%

Question 3

Standard deviation is a measure of:

Accepted Answer

Dispersion from the mean value

Answer

Chance

Answer

Central tendency

Answer

None of the above

Question 4

What is the denominator in the general fertility rate?

Accepted Answer

All women between 15-45 years of age

Answer

All married women between 15-45 years of age

Answer

Total number of live births

Answer

Total number of all births

Question 5

Under the Registration Act of 1969, within how many days must a death be registered?

Accepted Answer

14 days after the event

Answer

7 days after the event

Answer

20 days after the event

Answer

21 days after the event

Question 6

A study reports patient satisfaction levels as 'Satisfied', 'Very satisfied', and 'Dissatisfied'. Which type of scale best represents these categories?

Accepted Answer

Ordinal

Answer

Nominal

Answer

Interval

Answer

Ratio

Question 7

Amount of alcohol consumption amongst a group of alcoholics, before and after intervention, was recorded. What is the most appropriate statistical test to assess 'significant change' resulting from the intervention program?

Accepted Answer

Paired t-test

Answer

Unpaired t-test

Answer

Chi-square test

Answer

McNemar test

Question 8

The Hb level in healthy women has a mean of 13.5 g/dl and a standard deviation of 1.5 g/dl. What is the Z-score for a woman with an Hb level of 15.0 g/dl?

Accepted Answer

1.0

Answer

9.0

Answer

10.0

Answer

2.0

Question 9

What is the formula for calculating probability from odds?

Accepted Answer

Odds / (1 + Odds)

Answer

(1 + Odds) / Odds

Answer

(1 - Odds) / Odds

Question 10

What is the formula for calculating positive predictive value (PPV)?

Accepted Answer

a / (a + b)

Answer

d / (c + d)

Answer

a / (a + c)

Answer

d / (b + d)

Biostatistics — MCQs
