Biostatistics Practice Questions

Q: Disability-adjusted life expectancy (DALE) has been replaced by which of the following metrics?

HALE.

**Explanation:** The correct answer is **HALE (Health-Adjusted Life Expectancy)**.

**Why HALE is correct:** In the World Health Report 2000, the World Health Organization (WHO) introduced **Disability-Adjusted Life Expectancy (DALE)** to measure the average number of years a person can expect to live in "full health." In 2001, the WHO officially renamed DALE to **HALE**. HALE is a summary measure of population health that subtracts the years of ill-health (weighted by severity) from the overall life expectancy. It provides a more accurate picture of a population's health status than mortality rates alone.

**Why the other options are incorrect:**

* **DALY (Disability-Adjusted Life Year):** This is a measure of the **burden of disease**. One DALY represents the loss of one year of "healthy" life. It is the sum of Years of Life Lost (YLL) due to premature mortality and Years Lived with Disability (YLD).
* **QALY (Quality-Adjusted Life Year):** Primarily used in **health economics** to assess the value of medical interventions. It combines both the quantity and the quality of life generated by a specific treatment.
* **DFLE (Disability-Free Life Expectancy):** Also known as "Sullivan’s Index." It calculates the expectation of life free of disability. While related, it is a simpler binary measure (disabled vs. not disabled) compared to the severity-weighted HALE.

**High-Yield Clinical Pearls for NEET-PG:**

* **Sullivan’s Index (DFLE):** Considered one of the best indicators of a population's health status.
* **HALE Formula:** Life Expectancy – (Years lived with disability × disability weight).
* **DALY:** The most common metric used to quantify the Global Burden of Disease (GBD).
* **PQLI (Physical Quality of Life Index):** Includes Infant Mortality, Life Expectancy at Age 1, and Literacy (Scale 0-100). It does **not** include income.

Q: An experimental diagnostic test is developed to noninvasively detect the presence of trisomy 21, Down's syndrome. The test is administered to a group of 500 women considered to be at risk for a Down's fetus based on blood tests. The results are presented in the table below. What is the sensitivity of this new test?

Trisomy 21: Positive Test = 100, Negative Test = 100
Normal Karyotype: Positive Test = 50, Negative Test = 250

50%.

### Explanation

To calculate the sensitivity of a diagnostic test, we must first organize the data into a standard **2x2 Contingency Table**:

| | Trisomy 21 (Disease +) | Normal (Disease -) | Total |
| :--- | :---: | :---: | :---: |
| **Test Positive** | 100 (TP) | 50 (FP) | 150 |
| **Test Negative** | 100 (FN) | 250 (TN) | 350 |
| **Total** | **200** | **300** | **500** |

**1. Why the Correct Answer (B) is 50%:**

**Sensitivity** is the ability of a test to correctly identify those with the disease (True Positive Rate).

* **Formula:** [TP / (TP + FN)] × 100
* **Calculation:** [100 / (100 + 100)] × 100 = [100 / 200] × 100 = **50%**.

This means the test only identifies half of the actual Down's syndrome cases.

**2. Analysis of Incorrect Options:**

* **Option A (40%):** This is an incorrect calculation, likely derived from dividing TP by TN (100/250). It also equals the prevalence in the study group (200/500), which is not the sensitivity.
* **Option C (67%):** This represents the **Positive Predictive Value (PPV)**. Formula: [TP / (TP + FP)] = 100/150 = 66.7%.
* **Option D (71%):** This represents the **Negative Predictive Value (NPV)**. Formula: [TN / (TN + FN)] = 250/350 = 71.4%.

**3. Clinical Pearls & High-Yield Facts for NEET-PG:**

* **Sensitivity (SnNout):** A highly **S**e**n**sitive test, when **N**egative, helps rule **out** the disease. It is ideal for **screening** tests.
* **Specificity (SpPin):** A highly **Sp**ecific test, when **P**ositive, helps rule **in** the disease. It is ideal for **confirmatory** tests.
* **Specificity in this case:** [TN / (TN + FP)] = 250/300 = 83.3%.
* **Prevalence:** In this study group, prevalence is (Total Disease+ / Total Population) = 200/500 = 40%. Note that PPV and NPV are dependent on disease prevalence, whereas Sensitivity and Specificity are inherent properties of the test.
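The 2x2 arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `diagnostic_metrics` and its dictionary keys are made up for this example, and the counts are the ones given in the question.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return common 2x2-table measures as percentages.

    tp/fp/fn/tn are the true-positive, false-positive,
    false-negative and true-negative counts.
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": 100 * tp / (tp + fn),   # TP / all diseased
        "specificity": 100 * tn / (tn + fp),   # TN / all non-diseased
        "ppv": 100 * tp / (tp + fp),           # TP / all test-positive
        "npv": 100 * tn / (tn + fn),           # TN / all test-negative
        "prevalence": 100 * (tp + fn) / total, # diseased / everyone tested
    }


# Counts from the trisomy 21 question.
metrics = diagnostic_metrics(tp=100, fp=50, fn=100, tn=250)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```

Keeping all five measures in one function makes it easy to see which denominator each one uses, which is where most exam errors come from.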

Q: Calculate the Infant Mortality Rate (IMR) if in a population of 100,000 there are 3,000 live births in a year and 150 infant deaths in the same year?

50.

### Explanation

**1. Why the Correct Answer (D) is Right:**

The **Infant Mortality Rate (IMR)** is defined as the number of deaths of children under one year of age per 1,000 live births in a given year. It is a sensitive indicator of the overall health status of a community and the effectiveness of maternal and child health services.

**Formula:**

$$IMR = \frac{\text{Number of deaths under 1 year of age in a year}}{\text{Total number of live births in the same year}} \times 1,000$$

**Calculation:**

* Number of infant deaths = 150
* Total live births = 3,000
* $IMR = (150 / 3,000) \times 1,000$
* $IMR = 0.05 \times 1,000 = \mathbf{50}$

**2. Why the Incorrect Options are Wrong:**

* **Option A (75):** This value is mathematically incorrect based on the provided data.
* **Option B (18):** This value cannot be derived from the provided data. A common denominator mistake is to divide by the total population, as for the Crude Death Rate $(150/100,000 \times 1,000 = 1.5)$, but that is not the IMR and does not give 18 either.
* **Option C (5):** This is a result of a decimal error (calculating per 100 instead of per 1,000).

**3. High-Yield Clinical Pearls for NEET-PG:**

* **Denominator Alert:** The denominator for IMR is **Live Births**, not the mid-year population. This is a common trap in biostatistics questions.
* **Neonatal vs. Post-Neonatal:**
  * *Neonatal Mortality:* Deaths within 28 days of birth.
  * *Post-Neonatal Mortality:* Deaths from 28 days to under 1 year (influenced more by environmental factors like malnutrition and infections).
* **Most Common Causes of Infant Mortality in India:** Low Birth Weight (LBW) and Prematurity, followed by Pneumonia and Diarrheal diseases.
* **Current Trend:** Always keep track of the latest SRS (Sample Registration System) data for India's current IMR (currently hovering around 28 per 1,000 live births).
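As a quick check of the denominator point, the sketch below computes the rate twice: once with live births (the IMR) and once with the total population (the common mistake). The helper name `rate_per_1000` is hypothetical; the numbers come from the question.

```python
def rate_per_1000(events, denominator):
    """Events per 1,000 units of the chosen denominator."""
    return events * 1000 / denominator


infant_deaths = 150
live_births = 3_000
total_population = 100_000

# Correct: IMR uses live births as the denominator.
imr = rate_per_1000(infant_deaths, live_births)

# Incorrect for IMR: dividing by the total population instead.
wrong_denominator = rate_per_1000(infant_deaths, total_population)

print(f"IMR = {imr:g} per 1,000 live births")
print(f"Using total population instead gives {wrong_denominator:g}")
```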

Q: In a statistical study to calculate the effect of a drug on a patient's sugar level, a test showed a significant difference when in reality there was no difference. What is this type of error called?

Alpha error.

### Explanation

This question tests the fundamental understanding of **Hypothesis Testing** in Biostatistics, a high-yield area for NEET-PG.

**1. Why Alpha Error is Correct:**

An **Alpha ($\alpha$) error**, also known as a **Type I error**, occurs when a researcher rejects the Null Hypothesis ($H_0$) even though it is actually true. In clinical terms, this is a **"False Positive"** result. In this scenario, the test showed a "significant difference" (rejected the null) when in reality there was "no difference" (null was true). It is essentially "finding a difference where none exists."

**2. Analysis of Incorrect Options:**

* **Beta ($\beta$) error (Type II error):** This occurs when the researcher fails to reject a false Null Hypothesis. It is a **"False Negative"**: concluding there is no difference when one actually exists ("missing a real difference").
* **Gamma error:** This is not a standard term used in basic hypothesis testing for medical statistics.
* **Power of a test ($1-\beta$):** This is the probability that a test will correctly identify a significant difference if one truly exists. It is the ability of a study to avoid a Type II error.

**3. Clinical Pearls & High-Yield Facts:**

* **P-value:** The probability of obtaining a result at least as extreme as the one observed if the Null Hypothesis is true. It is compared against $\alpha$, the accepted probability of committing a Type I error. Usually, a p-value < 0.05 is considered statistically significant.
* **Confidence Level:** $1 - \alpha$. If $\alpha$ is 0.05 (5%), the Confidence Level is 95%.
* **Memory Aid:**
  * **Type I (Alpha):** **I**nnocent person convicted (False Positive).
  * **Type II (Beta):** **B**ad person set free (False Negative).
* **Relationship:** Decreasing the risk of a Type I error usually increases the risk of a Type II error unless the sample size is increased.
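A small simulation can make the Type I error concrete. In this sketch, both groups are drawn from the same distribution, so the Null Hypothesis is true by construction and every "significant" result is a false positive. The population mean of 100 and SD of 15, the group size, the number of trials and the use of a z-test with known SD are all assumptions chosen for illustration, not part of the question.

```python
import random
from statistics import NormalDist, fmean


def two_sample_z_pvalue(a, b, sigma):
    """Two-sided p-value for a difference in means, SD sigma assumed known."""
    se = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (fmean(a) - fmean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def type_i_error_rate(trials=2000, n=30, alpha=0.05, seed=0):
    """Fraction of trials that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(trials):
        # Same mean and SD in both groups: there is no real difference.
        a = [rng.gauss(100, 15) for _ in range(n)]
        b = [rng.gauss(100, 15) for _ in range(n)]
        if two_sample_z_pvalue(a, b, sigma=15) < alpha:
            false_positives += 1
    return false_positives / trials


rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

With $\alpha = 0.05$, the observed rate should land close to 5%, which is what "accepting a 5% risk of a Type I error" means in practice.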

Q: Which of the following best describes a normal curve?

The distribution of data is symmetrical.

In Biostatistics, the **Normal Distribution** (also known as the Gaussian distribution) is a fundamental concept representing how continuous variables are distributed in a population.

### 1. Why Option A is Correct

A normal curve is characterized by its **perfect symmetry** around the center. In a perfectly normal distribution:

* The curve is bell-shaped.
* The **Mean, Median, and Mode are all equal** and coincide at the peak of the curve.
* The total area under the curve is 1 (or 100%), with exactly 50% of observations lying on either side of the center.

### 2. Why Other Options are Incorrect

Options B, C, and D describe **Skewed Distributions**, where the symmetry is lost:

* **Option B (Mean > Mode):** This describes a **Positively Skewed** (Right-skewed) distribution. The tail extends towards the right (higher values), pulling the mean away from the peak.
* **Options C & D (Mode/Median > Mean):** These describe a **Negatively Skewed** (Left-skewed) distribution. The tail extends towards the left (lower values), pulling the mean down below the median and mode.

### 3. NEET-PG High-Yield Pearls

* **Standard Normal Curve:** A specific normal curve where the **Mean is 0** and the **Standard Deviation (SD) is 1**.
* **68-95-99.7 Rule (Empirical Rule):**
  * Mean ± 1 SD covers **68.3%** of values.
  * Mean ± 2 SD covers **95.4%** of values.
  * Mean ± 3 SD covers **99.7%** of values.
* **Clinical Application:** Most biological parameters (e.g., height, blood pressure, IQ) follow a normal distribution in a healthy population. If a distribution is highly skewed, the **Median** is considered a better measure of central tendency than the Mean.
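The empirical-rule percentages can be reproduced from the standard normal curve using Python's standard library. This is a minimal sketch; the helper name `coverage_within` is made up for the example.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)


def coverage_within(k):
    """Proportion of values lying within mean ± k SD."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"Mean ± {k} SD covers {100 * coverage_within(k):.1f}% of values")

# Symmetry: exactly half of the area lies on each side of the mean.
print(f"Area below the mean: {std_normal.cdf(0):.2f}")
```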

Q: Calculate the range from the following frequency distribution:

| Class Interval | Frequency |
|----------------|-----------|
| 10 – 15        | 3         |
| 15 – 20        | 7         |
| 20 – 25        | 5         |

Total n = 15

15.

***15***
- **Range** is calculated as the difference between the **maximum value** and the **minimum value** in the dataset.
- From the frequency distribution, the **lowest class boundary is 10** and the **highest class boundary is 25**, so Range = 25 - 10 = **15**.

*10*
- This represents the **lower boundary** of the first class interval, not the range of the distribution.
- Range requires calculating the **difference between extremes**, not just identifying the minimum value.

*20*
- This is the **upper boundary** of the second class interval, which is neither the maximum nor the range.
- It does not represent the **spread** or **variability** of the entire dataset.

*25*
- This is the **upper boundary** of the highest class interval, representing the maximum value but not the range.
- Range is the **difference between maximum and minimum**, not just the maximum value alone.
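For grouped data the same rule can be written out directly: take the upper boundary of the highest class minus the lower boundary of the lowest class. The sketch below encodes the table from the question as (lower, upper, frequency) tuples; that layout is just an assumption for the example.

```python
# Each class interval as (lower boundary, upper boundary, frequency).
classes = [
    (10, 15, 3),
    (15, 20, 7),
    (20, 25, 5),
]

lowest_boundary = min(lower for lower, _, _ in classes)
highest_boundary = max(upper for _, upper, _ in classes)
data_range = highest_boundary - lowest_boundary
total_n = sum(freq for _, _, freq in classes)

print(f"Range = {highest_boundary} - {lowest_boundary} = {data_range}")
print(f"Total n = {total_n}")
```

Note that the frequencies play no part in the range; they are summed here only to confirm the stated total.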

Q: An investigator finds that 5 independent factors influence the occurrence of a disease. Comparison of multiple factors that are responsible for the disease can be assessed by:

Multiple logistic regression.

### Explanation

The core of this question lies in identifying the relationship between multiple independent variables (risk factors) and a single dependent variable (disease occurrence).

**1. Why Multiple Logistic Regression is Correct:**

In medical research, the "occurrence of a disease" is typically a **dichotomous (binary) outcome**, meaning the patient either has the disease or does not (Yes/No). When you need to assess the influence of multiple independent factors (which can be continuous or categorical) on a single binary outcome, **Multiple Logistic Regression** is the statistical tool of choice. It calculates the **Odds Ratio (OR)** for each factor while controlling for confounders.

**2. Why the Other Options are Incorrect:**

* **ANOVA (Analysis of Variance):** Used to compare the **means** of a continuous variable across three or more categorical groups (e.g., comparing mean blood pressure across three different diet groups).
* **Multiple Linear Regression:** Used when the dependent variable is **continuous** (e.g., predicting exact blood sugar levels based on age, weight, and exercise). It is not used for binary "yes/no" outcomes.
* **Chi-square Test:** Used to find an association between two **categorical** variables (e.g., smoking and lung cancer). It cannot handle multiple independent factors simultaneously in its basic form.

**3. High-Yield Clinical Pearls for NEET-PG:**

* **Logistic Regression = Dichotomous Outcome** (Disease vs. No Disease). It yields **Odds Ratio**.
* **Linear Regression = Continuous Outcome** (Height, Weight, BP). It yields a **Regression Coefficient (b)**; the Correlation Coefficient (r) comes from correlation analysis.
* **ANOVA** = Comparison of **Means** (3+ groups).
* **Paired t-test** = Comparison of means in the **same group** (Before vs. After treatment).
* **Unpaired t-test** = Comparison of means between **two different groups**.

Q: A 95% confidence interval for the prevalence of cancer in smokers aged over 65 years is 56% to 76%. What is the chance that the true prevalence is less than 56%?

2.50%.

### Explanation

**1. Understanding the Correct Answer (C: 2.50%)**

A **95% Confidence Interval (CI)** represents the range within which we are 95% certain the true population parameter (prevalence) lies. This means there is a **5% total probability** that the true value falls *outside* this range.

In a normal distribution (bell curve), this 5% error is distributed equally into two "tails":

* **Lower Tail:** 2.5% chance the true value is *less than* the lower limit (56%).
* **Upper Tail:** 2.5% chance the true value is *greater than* the upper limit (76%).

Therefore, the probability that the true prevalence is less than 56% is exactly **2.5%**.

**2. Why Other Options are Incorrect**

* **A (Nil):** Incorrect. A confidence interval does not provide absolute certainty; there is always a calculated risk of error (alpha).
* **B (44%):** Incorrect. This is simply the complement of the lower limit (100% - 56%), which has no statistical relevance to the probability of the true prevalence.
* **D (5%):** Incorrect. This represents the *total* probability of the true value being outside the interval (both tails combined). The question specifically asks for the probability of being *less than* the lower limit (one tail).

**3. High-Yield Clinical Pearls for NEET-PG**

* **Confidence Interval (CI) Formula:** $Mean \pm (1.96 \times SE)$ for 95% CI; $Mean \pm (2.58 \times SE)$ for 99% CI.
* **Precision vs. Sample Size:** A larger sample size results in a narrower (more precise) confidence interval.
* **P-value vs. CI:** If a 95% CI for a difference between two groups includes **zero**, the results are not statistically significant ($p > 0.05$). If a 95% CI for an Odds Ratio or Relative Risk includes **one**, it is not significant.
* **Interpretation:** If the study were repeated 100 times and a 95% CI calculated each time, about 95 of those intervals would be expected to contain the true value.
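The link between the 1.96 multiplier and the 2.5% tails can be verified numerically. The sketch below assumes a symmetric, normal-approximation interval, which is the assumption the explanation itself relies on; the standard error is back-calculated from the 56% to 76% interval and is not given in the question.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal curve

confidence = 0.95
alpha = 1 - confidence

# Critical value that leaves alpha/2 in each tail.
z_crit = z.inv_cdf(1 - alpha / 2)
lower_tail = z.cdf(-z_crit)
upper_tail = 1 - z.cdf(z_crit)

# Back-calculate the point estimate and SE from the stated interval
# (assumes the interval is estimate ± z_crit × SE).
lower, upper = 56.0, 76.0
estimate = (lower + upper) / 2
se = (upper - lower) / (2 * z_crit)

print(f"z multiplier for 95% CI: {z_crit:.2f}")
print(f"Lower tail: {100 * lower_tail:.1f}%, upper tail: {100 * upper_tail:.1f}%")
print(f"Point estimate: {estimate:.0f}%, implied SE: {se:.2f} percentage points")
```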

Q: In a community of 6000 people, there are 150 cases of TB and 30 deaths due to TB. What is the TB-specific death rate per 1000 population?

0-5.

### Explanation

**1. Understanding the Correct Answer (D)**

The **Specific Death Rate** measures the number of deaths due to a specific cause per 1,000 population in a given year. The formula is:

$$\text{Specific Death Rate} = \frac{\text{Number of deaths from a specific disease}}{\text{Total mid-year population}} \times 1000$$

**Calculation:**

* Total Population = 6,000
* Deaths due to TB = 30
* Calculation: $(30 / 6,000) \times 1,000 = 5$ per 1,000 population.

Since the result is exactly 5, it falls within the range of **Option D (0-5)**.

**2. Why Other Options are Incorrect**

* **Option A (20):** This value is obtained if you calculate the **Case Fatality Rate (CFR)**. CFR is the percentage of people diagnosed with a disease who die from it: $(30 / 150) \times 100 = 20\%$. While 20 is a relevant number, it represents lethality, not the population death rate.
* **Option B (10):** This is a distractor resulting from calculation errors (e.g., using 3,000 as the denominator).
* **Option C (5):** While the numerical value is 5, in many competitive exams, if a range is provided that includes the exact value (0-5), it is selected as the most appropriate category.

**3. NEET-PG High-Yield Pearls**

* **Case Fatality Rate (CFR):** Reflects the **virulence** or killing power of a disease. It is a ratio, not a true rate (expressed as a percentage).
* **Cause-Specific Death Rate:** Reflects the **burden** of a disease on the total community.
* **Proportional Mortality Rate:** (Deaths from TB / Total deaths from all causes) × 100. It indicates the relative importance of a specific cause of death.
* **Prevalence of TB in this scenario:** $(150 / 6,000) \times 100 = 2.5\%$.
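The three measures in this explanation differ only in their numerator and denominator, which the short sketch below lays side by side. Variable names are illustrative; the counts are from the question.

```python
population = 6_000
tb_cases = 150
tb_deaths = 30

# Cause-specific death rate: deaths per 1,000 of the total population.
specific_death_rate = tb_deaths * 1000 / population

# Case fatality rate: deaths as a percentage of diagnosed cases.
case_fatality_rate = tb_deaths * 100 / tb_cases

# Prevalence: cases as a percentage of the total population.
prevalence = tb_cases * 100 / population

print(f"TB-specific death rate: {specific_death_rate:g} per 1,000 population")
print(f"Case fatality rate: {case_fatality_rate:g}%")
print(f"Prevalence: {prevalence:g}%")
```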

Question 1

Chi-square test is used for which of the following comparisons?

Accepted Answer

Comparing percentages, proportions, and fractions in two or more different groups of individuals

Answer

Comparing percentages, proportions, and fractions in paired data

Answer

Comparing percentages, proportions, and fractions in matched paired data

Answer

Comparing percentages, proportions, and fractions in two unpaired samples

Question 2

Disability-adjusted life expectancy (DALE) has been replaced by which of the following metrics?

Accepted Answer

HALE

Answer

DALY

Answer

QALY

Answer

DFLE

Question 3

An experimental diagnostic test is developed to noninvasively detect the presence of trisomy 21, Down's syndrome. The test is administered to a group of 500 women considered to be at risk for a Down's fetus based on blood tests. The results are presented in the table below. What is the sensitivity of this new test?

Trisomy 21: Positive Test = 100, Negative Test = 100
Normal Karyotype: Positive Test = 50, Negative Test = 250

Accepted Answer

50%

Answer

40%

Answer

67%

Answer

71%

Question 4

Calculate the Infant Mortality Rate (IMR) if in a population of 100,000 there are 3,000 live births in a year and 150 infant deaths in the same year?

Accepted Answer

50

Answer

75

Answer

18

Answer

5

Question 5

In a statistical study to calculate the effect of a drug on a patient's sugar level, a test showed a significant difference when in reality there was no difference. What is this type of error called?

Accepted Answer

Alpha error

Answer

Beta error

Answer

Gamma error

Answer

Power of a test

Question 6

Which of the following best describes a normal curve?

Accepted Answer

The distribution of data is symmetrical.

Answer

The mean is greater than the mode.

Answer

The mode is greater than the mean.

Answer

The median is greater than the mean.

Question 7

Calculate the range from the following frequency distribution:

| Class Interval | Frequency |
|----------------|-----------|
| 10 – 15        | 3         |
| 15 – 20        | 7         |
| 20 – 25        | 5         |

Total n = 15

Accepted Answer

15

Answer

10

Answer

20

Answer

25

Question 8

An investigator finds that 5 independent factors influence the occurrence of a disease. Comparison of multiple factors that are responsible for the disease can be assessed by:

Accepted Answer

Multiple logistic regression

Answer

ANOVA

Answer

Multiple linear regression

Answer

Chi-square test

Question 9

A 95% confidence interval for the prevalence of cancer in smokers aged over 65 years is 56% to 76%. What is the chance that the true prevalence is less than 56%?

Accepted Answer

2.50%

Answer

Nil

Answer

44%

Answer

5%

Question 10

In a community of 6000 people, there are 150 cases of TB and 30 deaths due to TB. What is the TB-specific death rate per 1000 population?

Accepted Answer

0-5

Answer

20

Answer

10

Answer

5
