Biostatistics Practice Questions

Q: Calculate the sensitivity and specificity of an ELISA test for HIV screening, given the following data:

|             | HIV Present | HIV Absent |
|-------------|-------------|------------|
| ELISA +ve   | 80          | 40         |
| ELISA -ve   | 20          | 60         |

Sensitivity 80%, Specificity 60%.

### Explanation

To solve this problem, first organize the data into a standard 2x2 contingency table:

| | Disease Present (HIV+) | Disease Absent (HIV-) | Total |
| :--- | :---: | :---: | :---: |
| **Test Positive** | 80 (TP) | 40 (FP) | 120 |
| **Test Negative** | 20 (FN) | 60 (TN) | 80 |
| **Total** | 100 | 100 | 200 |

**1. Sensitivity (True Positive Rate):** Sensitivity measures the ability of a test to correctly identify those with the disease.

* **Formula:** [TP / (TP + FN)] × 100
* **Calculation:** [80 / (80 + 20)] × 100 = **80%**

**2. Specificity (True Negative Rate):** Specificity measures the ability of a test to correctly identify those without the disease.

* **Formula:** [TN / (TN + FP)] × 100
* **Calculation:** [60 / (60 + 40)] × 100 = **60%**

---

### Analysis of Options

* **Option B (Correct):** Correctly identifies Sensitivity as 80% and Specificity as 60%.
* **Option A:** Incorrectly swaps the values for sensitivity and specificity.
* **Option C & D:** These values (66.6% and 75%) represent the **Positive Predictive Value (PPV)** and **Negative Predictive Value (NPV)**, not sensitivity and specificity.
    * PPV = TP / (TP + FP) = 80/120 ≈ 66.7% (shown as 66.6% in the options)
    * NPV = TN / (TN + FN) = 60/80 = 75%

---

### NEET-PG High-Yield Pearls

* **SNOUT:** **S**ensitivity rules **OUT** the disease (used for screening; high sensitivity means few False Negatives).
* **SPIN:** **S**pecificity rules **IN** the disease (used for confirmation; high specificity means few False Positives).
* **Prevalence Independence:** Sensitivity and Specificity are inherent properties of a test and **do not change** with disease prevalence. Predictive Values (PPV/NPV), in contrast, depend strongly on prevalence.
* **HIV Protocol:** ELISA is a highly sensitive screening test, while Western Blot (or Geenius™) is a highly specific confirmatory test.
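The four ratios discussed for this question can be checked with a few lines of Python. This is only a minimal sketch: the function name `diagnostic_metrics` and its argument names are illustrative choices, not part of the question, and the counts are the ones given in the 2x2 table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV as percentages."""
    return {
        "sensitivity": 100 * tp / (tp + fn),  # diseased who test positive
        "specificity": 100 * tn / (tn + fp),  # non-diseased who test negative
        "ppv": 100 * tp / (tp + fp),          # test-positives who are diseased
        "npv": 100 * tn / (tn + fn),          # test-negatives who are non-diseased
    }


# Counts from the ELISA table: TP = 80, FP = 40, FN = 20, TN = 60
metrics = diagnostic_metrics(tp=80, fp=40, fn=20, tn=60)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```

The first two values reproduce the accepted answer, and the last two reproduce the distractor values in the remaining options.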

Q: Which of the following represents sensitivity?

True Positives / (True Positives + False Negatives).

### Explanation

**Sensitivity** is defined as the ability of a screening test to correctly identify those who truly have the disease. It represents the "True Positive Rate."

**1. Why Option C is Correct:** Sensitivity is calculated as the proportion of people with the disease who test positive. In a 2x2 contingency table, the total number of diseased individuals is the sum of **True Positives (TP)** and **False Negatives (FN)**. Therefore, the formula is:

$$\text{Sensitivity} = \frac{\text{TP}}{\text{TP} + \text{FN}}$$

A test with high sensitivity is crucial for screening because it ensures that very few diseased individuals are missed (low false-negative rate).

**2. Analysis of Incorrect Options:**

* **Option A:** This is the formula for **Specificity** (True Negative Rate). It measures the ability of a test to correctly identify those without the disease.
* **Option B:** True Negatives / (True Negatives + False Negatives) is the formula for **Negative Predictive Value (NPV)**, the probability that a person who tests negative is truly free of the disease. It is not sensitivity.
* **Option D:** This is the formula for **Positive Predictive Value (PPV)**. It indicates the probability that a person who tests positive actually has the disease.

**3. Clinical Pearls for NEET-PG:**

* **SNOUT:** A highly **S**ensitive test, when **N**egative, rules **OUT** the disease.
* **SPIN:** A highly **S**pecific test, when **P**ositive, rules **IN** the disease.
* **Screening vs. Diagnosis:** Sensitivity is the priority for screening tests (e.g., ELISA for HIV), while Specificity is the priority for confirmatory tests (e.g., Western Blot).
* **Complementary Relationship:** Sensitivity and the False Negative rate add up to 1 (Sensitivity = 1 – FN rate), so raising one lowers the other.

Q: Quantiles divide a set of data into how many equal parts?

5 (taking the stem's "quantiles" to mean quintiles).

**Explanation**

In biostatistics, **quantiles** are values that divide a frequency distribution into equal, contiguous intervals. "Quantile" is the generic parent term for any such division, so strictly it does not imply a fixed number of parts. In exam question stems, however, it is often used interchangeably with **Quintiles**, which divide the data into **5 equal parts**, each representing 20% of the total population.

**Analysis of Options:**

* **Option B (Correct):** Quintiles divide the data into **5 equal parts**. In public health, quintiles are frequently used to categorize "Wealth Index" or "Socio-economic status," where the population is divided from the poorest 20% to the richest 20%.
* **Option A (Incorrect):** Division into 3 equal parts corresponds to **tertiles** (2 cut points creating 3 segments), not quintiles.
* **Option C (Incorrect):** **Deciles** divide the data into **10 equal parts** (each representing 10%).
* **Option D (Incorrect):** 15 is not a standard division used in descriptive biostatistics.

**High-Yield Clinical Pearls for NEET-PG:**

* **Median:** Divides data into **2** equal parts (50th percentile).
* **Quartiles:** Divide data into **4** equal parts (25% each). Note: There are 3 quartile points (Q1, Q2, Q3).
* **Percentiles:** Divide data into **100** equal parts (1% each).
* **Interquartile Range (IQR):** The difference between the 75th (Q3) and 25th (Q1) percentiles; it is the preferred measure of dispersion for skewed data.
* **Wealth Index** in NFHS (National Family Health Survey) data is presented in **quintiles**.
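The relationship between the number of equal parts and the number of cut points (k parts need k − 1 cut points) can be illustrated with Python's standard `statistics.quantiles`. The data set (the integers 1 to 100) and the helper name `cut_points` are assumptions made purely for illustration.

```python
import statistics


def cut_points(data, parts):
    """Return the cut points that split `data` into `parts` equal groups."""
    return statistics.quantiles(data, n=parts)


# Hypothetical data set: the integers 1..100
data = list(range(1, 101))

for label, parts in [("median", 2), ("quartiles", 4),
                     ("quintiles", 5), ("deciles", 10),
                     ("percentiles", 100)]:
    points = cut_points(data, parts)
    print(f"{label}: {parts} parts, {len(points)} cut points")
```

Quintiles therefore yield 4 cut points and 5 groups, just as quartiles yield the 3 points Q1, Q2 and Q3.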

Q: What statistical method is used to calculate the death rate between two populations with different age groups?

Standardized death rate.

**Explanation:**

The correct answer is **Standardized Death Rate**.

**1. Why Standardized Death Rate is correct:** When comparing mortality between two populations, age is the most significant confounding factor because death rates vary naturally across age groups (e.g., higher in the elderly). If one population has a higher proportion of elderly individuals, its total death rate will appear higher regardless of the actual health conditions. **Standardization (Direct or Indirect)** removes the confounding effect of age by referring both populations to a common standard (in the direct method, each population's age-specific rates are applied to a "Standard Population"), allowing a fair "apples-to-apples" comparison.

**2. Why other options are incorrect:**

* **Crude Death Rate (CDR):** This is the actual number of deaths per 1,000 mid-year population. It does not account for age distribution, making it unsuitable for comparing populations with different demographic structures.
* **Case Fatality Rate (CFR):** This measures the killing power of a specific disease (Deaths from disease / Total cases of that disease). It is a measure of virulence, not a tool for population-wide mortality comparison.
* **Age-Specific Death Rate:** This is the death rate for a specific age group (e.g., 5–14 years). While it provides detail, it does not give a single summary measure for comparing two entire populations.

**High-Yield Clinical Pearls for NEET-PG:**

* **Direct Standardization:** Used when the age-specific death rates of the study population are **known**.
* **Indirect Standardization:** Used when age-specific rates are **unknown** or the population is small. It calculates the **Standardized Mortality Ratio (SMR)**.
* **SMR Formula:** (Observed Deaths / Expected Deaths) × 100.
* Standardization is the accepted approach for comparing vital statistics (morbidity or mortality) across different geographical areas.
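The two methods can be sketched in Python as follows. All numbers here (the standard population, the age bands, the age-specific rates, and the observed and expected deaths) are hypothetical values invented for illustration, and the function names are not taken from any library.

```python
def directly_standardized_rate(age_specific_rates, standard_population):
    """Direct method: apply each age-specific rate (per 1,000) to the
    standard population and return the overall rate per 1,000."""
    expected_deaths = sum(
        rate / 1000 * standard_population[age]
        for age, rate in age_specific_rates.items()
    )
    return 1000 * expected_deaths / sum(standard_population.values())


def standardized_mortality_ratio(observed_deaths, expected_deaths):
    """Indirect method summary: SMR = (observed / expected) x 100."""
    return 100 * observed_deaths / expected_deaths


# Hypothetical standard population and age-specific death rates per 1,000
standard_population = {"0-14": 30_000, "15-59": 60_000, "60+": 10_000}
rates_a = {"0-14": 2.0, "15-59": 5.0, "60+": 50.0}
rates_b = {"0-14": 3.0, "15-59": 6.0, "60+": 40.0}

rate_a = directly_standardized_rate(rates_a, standard_population)
rate_b = directly_standardized_rate(rates_b, standard_population)
print(f"Standardized rate A: {rate_a:.1f} per 1,000")
print(f"Standardized rate B: {rate_b:.1f} per 1,000")

# Hypothetical indirect example: 120 deaths observed where 100 were expected
print(f"SMR: {standardized_mortality_ratio(120, 100):.0f}")
```

Because both populations are weighted by the same standard age structure, any remaining difference between the two standardized rates is no longer attributable to age composition.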

Q: What is true about a normal distribution (bell curve)?

The mean, median, and mode are equal.

### Explanation

In a **Normal Distribution** (also known as the Gaussian distribution), data are distributed symmetrically around the center, forming a characteristic bell-shaped curve.

**Analysis of Options:**

* **Correct answer (Option B):** The defining feature of a normal distribution is that the **Mean, Median, and Mode are equal**. *Note on the key: "The variance is 0" cannot be true of a bell curve. A variance of 0 implies that all data points are identical, which gives a single vertical line rather than a curve, so any key that marks it as correct is in error. For NEET-PG purposes, always prioritize the equality of the measures of central tendency.*
* **Why Option B is the standard truth:** In a perfectly symmetrical bell curve, the peak (Mode) is exactly in the middle, which is also the average (Mean) and the 50th percentile (Median).
* **Why Option A is wrong:** A normal distribution has **zero skewness**. If it were skewed to the left, it would be a "negatively skewed" distribution whose tail points toward the lower values.
* **Why Options C & D are wrong:** Standard deviation and variance measure the "spread" of data. In a normal distribution, data are spread out according to the **68-95-99.7 rule**. If the variance or SD were 0, there would be no "curve" at all.

**High-Yield Clinical Pearls for NEET-PG:**

1. **Area under the curve:**
    * Mean ± 1 SD covers **68.3%** of values.
    * Mean ± 2 SD covers **95.4%** of values.
    * Mean ± 3 SD covers **99.7%** of values.
2. **Standard Normal Distribution:** A specific case where the **Mean = 0** and **Standard Deviation = 1**.
3. **Z-score:** Indicates how many standard deviations a data point is from the mean.
4. **Skewness:** If Mean > Median, the distribution is **Positively Skewed** (tail to the right); if Mean < Median, it is **Negatively Skewed** (tail to the left).
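The area-under-the-curve figures and the Z-score can be verified with `statistics.NormalDist` from the Python standard library. This is a minimal sketch; the helper names `coverage_within` and `z_score`, and the example height values, are assumptions for illustration only.

```python
from statistics import NormalDist


def coverage_within(k, mean=0.0, sd=1.0):
    """Fraction of a normal distribution lying within mean ± k SD."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)


def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


for k in (1, 2, 3):
    print(f"Mean ± {k} SD covers {100 * coverage_within(k):.1f}% of values")

# Hypothetical example: a height of 180 cm when the mean is 170 cm and SD is 5 cm
print(z_score(180, mean=170, sd=5))
```

The coverage does not depend on the particular mean and SD chosen, which is why the same percentages apply to any normally distributed variable.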

Q: Which measures of central tendency and dispersion are typically used to construct a confidence limit?

Mean and standard error.

### Explanation

**1. Why "Mean and Standard Error" is Correct:** Confidence limits (or Confidence Intervals) are used in inferential statistics to estimate the range within which a population parameter (like the true population mean) is likely to lie. The formula for a 95% Confidence Interval is:

**$CI = \text{Mean} \pm (1.96 \times \text{Standard Error})$**

* **Mean:** This is the measure of central tendency used as the point estimate.
* **Standard Error (SE):** This is the measure of dispersion used to account for sampling variation. SE represents the standard deviation of the sampling distribution of the mean ($SE = \frac{SD}{\sqrt{n}}$). It tells us how far the sample mean is likely to be from the true population mean.

**2. Why Other Options are Incorrect:**

* **Option A & D:** While **Standard Deviation (SD)** is a measure of dispersion, it describes the spread of individual observations within a single sample. It is used to define the "Normal Range" (Reference Range) for individuals, not the confidence limits for a population estimate.
* **Option B:** The **Median** is used for skewed data or non-parametric tests. Confidence intervals for the median exist but are not the standard "confidence limits" typically referred to in medical research, which assume a normal distribution of the sample mean.

**3. High-Yield Clinical Pearls for NEET-PG:**

* **SD vs. SE:** Use **SD** to describe the sample (e.g., "The average height of students was $170 \pm 5$ cm"). Use **SE** to make inferences about the population.
* **95% CI:** Corresponds to a Z-value of 1.96 (often rounded to 2).
* **99% CI:** Corresponds to a Z-value of 2.58.
* **Precision:** The narrower the Confidence Interval, the more precise the estimate. Increasing the sample size ($n$) decreases the SE, thereby narrowing the CI.
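The formula Mean ± (Z × SE) translates directly into code. The sketch below uses a hypothetical sample of eight heights and the Z-values quoted above; with a sample this small a t-multiplier would normally replace Z, so treat it as an illustration of the arithmetic rather than a recommended analysis. The function name `confidence_interval` is an assumption.

```python
import math
import statistics


def confidence_interval(sample, z=1.96):
    """Return (lower, upper) confidence limits as mean ± z * SE."""
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)          # sample SD (n - 1 denominator)
    se = sd / math.sqrt(len(sample))       # SE = SD / sqrt(n)
    return mean - z * se, mean + z * se


# Hypothetical heights in cm
heights = [168, 172, 170, 165, 175, 171, 169, 174]

low95, high95 = confidence_interval(heights, z=1.96)
low99, high99 = confidence_interval(heights, z=2.58)
print(f"95% CI: {low95:.1f} to {high95:.1f} cm")
print(f"99% CI: {low99:.1f} to {high99:.1f} cm")
```

The 99% interval comes out wider than the 95% interval for the same data, since a larger Z-value multiplies the same standard error.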

Q: In a study to determine the relationship between the presence of Ischemic Heart Disease (IHD) and smoking, what is the appropriate statistical test?

Chi-square test.

### Explanation

The correct answer is **Chi-square test**.

**Why Chi-square test is correct:** In biostatistics, the choice of a statistical test depends on the type of data being analyzed. In this study, we are looking at the relationship between two **qualitative (categorical)** variables:

1. **Smoking:** Categorized as "Smoker" or "Non-smoker."
2. **Ischemic Heart Disease (IHD):** Categorized as "Present" or "Absent."

The Chi-square test is the standard test used to compare proportions or to test the association between two categorical variables. It determines whether the observed frequencies in a 2x2 contingency table differ significantly from the expected frequencies.

**Why other options are incorrect:**

* **Z-test:** In this classification it is a parametric test used for **quantitative** data when the sample size is large (n > 30), comparing means rather than testing association between categorical outcomes.
* **Paired t-test:** This is used for **quantitative** data to compare the means of two related groups (e.g., "before and after" measurements in the same individual). It is not applicable to categorical data like IHD status.

**High-Yield Clinical Pearls for NEET-PG:**

* **Qualitative + Qualitative:** Chi-square test; Fisher's exact test if an expected cell frequency is < 5.
* **Quantitative outcome, > 2 groups:** ANOVA (Analysis of Variance).
* **Correlation:** To check the strength of a linear relationship between two quantitative variables (e.g., Height and Weight).
* **Regression:** To predict the value of one variable based on another.
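To make the comparison of observed and expected frequencies concrete, the Pearson chi-square statistic for a 2x2 table can be computed by hand. The counts below are hypothetical (the question gives no data), the function name `chi_square_2x2` is an assumption, and no continuity correction is applied.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]], summing (observed - expected)^2 / expected."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


# Hypothetical counts: rows = smoker / non-smoker, columns = IHD present / absent
chi2 = chi_square_2x2(30, 70, 15, 85)

# Critical value of chi-square for 1 degree of freedom at alpha = 0.05
CRITICAL_VALUE_DF1 = 3.841
print(f"chi-square = {chi2:.2f}, significant: {chi2 > CRITICAL_VALUE_DF1}")
```

A 2x2 table has (2 − 1) × (2 − 1) = 1 degree of freedom, which is why the statistic is compared against the critical value for df = 1.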

Question 1

Calculate the sensitivity and specificity of an ELISA test for HIV screening, given the following data:

|             | HIV Present | HIV Absent |
|-------------|-------------|------------|
| ELISA +ve   | 80          | 40         |
| ELISA -ve   | 20          | 60         |

Accepted Answer

Sensitivity 80%, Specificity 60%

Answer

Sensitivity 60%, Specificity 80%

Answer

Sensitivity 66.6%, Specificity 75%

Answer

Sensitivity 75%, Specificity 66.6%

Question 2

Which of the following represents sensitivity?

Accepted Answer

True Positives / (True Positives + False Negatives)

Answer

True Negatives / (True Negatives + False Positives)

Answer

True Negatives / (True Negatives + False Negatives)

Answer

True Positives / (True Positives + False Positives)

Question 3

Quantiles divide a set of data into how many equal parts?

Accepted Answer

5

Answer

3

Answer

10

Answer

15

Question 4

Which of the following distributions is symmetrical?

Accepted Answer

Normal distribution

Answer

Bimodal distribution

Answer

Skewed distribution

Answer

U-shaped distribution

Question 5

When the confidence level of a test is increased, which of the following will happen?

Accepted Answer

Previously significant value becomes insignificant

Answer

No effect on significance

Answer

Previously insignificant value becomes significant

Answer

No change in hypothesis

Question 6

What statistical method is used to calculate the death rate between two populations with different age groups?

Accepted Answer

Standardized death rate

Answer

Crude death rate

Answer

Case fatality rate

Answer

Age-specific death rate

Question 7

What is true about a normal distribution (bell curve)?

Accepted Answer

The mean, median, and mode are equal.

Answer

It is skewed to the left.

Answer

The variance is 0.

Answer

The standard deviation is 0.

Question 8

Which measures of central tendency and dispersion are typically used to construct a confidence limit?

Accepted Answer

Mean and standard error

Answer

Range and standard deviation

Answer

Median and standard error

Answer

Mode and standard deviation

Question 9

In a study to determine the relationship between the presence of Ischemic Heart Disease (IHD) and smoking, what is the appropriate statistical test?

Accepted Answer

Chi-square test

Answer

Z-test

Answer

Paired t-test

Answer

None of the above

Question 10

The regression between height and age follows y=a+bx. What kind of curve does this represent?

Accepted Answer

Straight line

Answer

Hyperbola

Answer

Sigmoid

Answer

Parabola
