Biostatistics Practice Questions

Q: All of the following are true of the standardized mortality ratio except:

It is expressed as a rate per year.

### Explanation

**Standardized Mortality Ratio (SMR)** is a tool used in indirect standardization to compare the mortality experience of a specific group (e.g., an occupational cohort) with that of a general population.

**Why Option A is the Correct Answer (The False Statement):** SMR is a **ratio**, not a rate. It is mathematically expressed as:

$$\text{SMR} = \frac{\text{Observed Deaths}}{\text{Expected Deaths}} \times 100$$

Because it is a ratio of two counts (observed vs. expected), it has no time dimension such as "per year" and no multiplier such as "per 1,000" in the way a crude death rate does. It is typically expressed as a percentage.

**Analysis of Other Options:**

* **Option B (Adjusted for age):** SMR is the primary method of **indirect standardization**, specifically used to account for age distribution differences when age-specific death rates for the study population are unknown.
* **Option C (Used for other events):** While "Mortality" is in the name, the mathematical principle can be applied to other events like morbidity, hospitalizations, or complications (Standardized Incidence Ratio).
* **Option D (Observed/Expected):** This is the fundamental definition of SMR. "Expected deaths" are calculated by applying the age-specific death rates of a standard population to the age structure of the study population.

### High-Yield Pearls for NEET-PG:

* **Interpretation:** An SMR of 100 means observed deaths equal expected deaths. An SMR of 150 means mortality is 50% higher than expected.
* **Direct vs. Indirect:** Use **Direct Standardization** when age-specific death rates of the study population are known. Use **Indirect (SMR)** when they are unknown or the study population is small.
* **Key Utility:** SMR is frequently used in **occupational health** to study the "Healthy Worker Effect."
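The indirect standardization described above can be sketched in a few lines of Python. This is only an illustration: the age bands, rates, cohort sizes, and observed death count are hypothetical values invented for the example, and the function names `expected_deaths` and `smr` are not from any library.

```python
def expected_deaths(standard_rates, study_population):
    """Apply the standard population's age-specific death rates
    to the age structure of the study population."""
    return sum(standard_rates[age] * study_population[age] for age in study_population)


def smr(observed, expected):
    """Standardized mortality ratio as a percentage: observed / expected x 100."""
    return observed / expected * 100


# Hypothetical figures, invented for illustration only.
standard_rates = {"20-39": 0.001, "40-59": 0.005, "60+": 0.020}  # deaths per person per year
study_population = {"20-39": 4000, "40-59": 3000, "60+": 1000}   # persons in the study cohort
observed = 78                                                     # deaths observed in one year

expected = expected_deaths(standard_rates, study_population)      # 4 + 15 + 20, about 39
print(f"Expected deaths: {expected:.1f}")
print(f"SMR: {smr(observed, expected):.0f}")                      # about 200: twice the expected mortality
```

Note that the result carries no "per year" or "per 1,000" unit: the time and population terms cancel when observed deaths are divided by expected deaths.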

Q: What is the probability that a confounding factor falls to the right of the 95% confidence interval?

1 in 40.

### Explanation

**1. Why Option D (1 in 40) is Correct**

A 95% Confidence Interval (CI) is a range constructed so that, over repeated sampling, it contains the true population value 95% of the time. This leaves a total error probability (alpha) of **5% (or 1 in 20)** that the value falls outside this range. In a standard normal distribution, this 5% error is **split equally** between the two tails of the curve:

* **Left Tail:** 2.5% probability (value is lower than the CI).
* **Right Tail:** 2.5% probability (value is higher than the CI).

The question specifically asks for the probability of falling to the **right** (one tail only).

* Calculation: 2.5% = 2.5/100 = 1/40.

Thus, there is a **1 in 40** chance that the factor falls specifically to the right of the 95% CI.

**2. Why Other Options are Incorrect**

* **Option A (1 in 5):** Represents a 20% probability, which is the total error for an 80% CI.
* **Option B (1 in 10):** Represents a 10% probability, which is the total error for a 90% CI.
* **Option C (1 in 20):** This is the **total probability** (5%) of the value falling outside the 95% CI (both tails combined). It is a common distractor for students who forget to divide by two for a single tail.

**3. High-Yield Clinical Pearls for NEET-PG**

* **Confidence Interval vs. P-value:** If the 95% CI for a Relative Risk (RR) or Odds Ratio (OR) includes **1**, the result is not statistically significant (p > 0.05).
* **Width of CI:** A narrower CI indicates greater precision, which usually reflects a larger sample size.
* **Standard Error:** The 95% CI is calculated as Mean ± (1.96 × Standard Error).
* **Rule of Thumb:**
  * 95% CI = Mean ± 2 SE (approx.)
  * 99% CI = Mean ± 2.58 SE
  * 68% CI = Mean ± 1 SE
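The tail arithmetic above can be checked numerically. The following is a minimal sketch that assumes a standard normal distribution, as the explanation does, and uses Python's standard-library `statistics.NormalDist`; the variable names are chosen for this example only.

```python
from statistics import NormalDist

z = NormalDist()                     # standard normal: mean 0, SD 1

upper_limit = z.inv_cdf(0.975)       # upper bound of the central 95%, about 1.96
right_tail = 1 - z.cdf(upper_limit)  # probability of falling to the right of the interval
left_tail = z.cdf(-upper_limit)      # probability of falling to the left of the interval

print(f"Upper limit: {upper_limit:.2f} SE")
print(f"Right tail:  {right_tail:.3f} (about 1 in {1 / right_tail:.0f})")
print(f"Both tails:  {right_tail + left_tail:.3f} (about 1 in {1 / (right_tail + left_tail):.0f})")
```

The right tail alone gives the 1 in 40 answer, while both tails together give the 1 in 20 distractor.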

Q: A diagnostic test with high sensitivity and low specificity will result in which of the following?

High true positive rate.

### Explanation

**1. Why Option A is Correct:**

Sensitivity is defined as the ability of a test to correctly identify those with the disease. Mathematically, it is the **True Positive Rate (TPR)**: $[TP / (TP + FN)]$. A test with **high sensitivity** ensures that most individuals who actually have the disease will test positive. Therefore, high sensitivity is, by definition, a high true positive rate.

**2. Analysis of Incorrect Options:**

* **Option B (High false negative rate):** This is incorrect. Sensitivity and False Negative Rate (FNR) are complementary ($Sensitivity + FNR = 100\%$). Therefore, a high sensitivity test results in a **low false negative rate**, making it excellent for "screening" to ensure no cases are missed.
* **Option C (Low true negative rate):** While the question states the test has low specificity (which means a low True Negative Rate), the primary and most direct consequence of *high sensitivity* is the high true positive rate. In the context of NEET-PG, always prioritize the direct definition of the primary parameter mentioned.
* **Option D (Low true positive rate):** This is the opposite of the definition of sensitivity.

**3. NEET-PG High-Yield Pearls:**

* **SNOUT:** **S**ensitivity rules **OUT** disease (due to low false negatives).
* **SPIN:** **S**pecificity rules **IN** disease (due to low false positives).
* **Screening vs. Diagnosis:** High sensitivity tests are preferred for **screening** (e.g., ELISA for HIV), whereas high specificity tests are used for **confirmation** (e.g., Western Blot for HIV).
* **Relationship:** Sensitivity and the False Negative Rate are complementary (Sensitivity = 1 − FNR), as are Specificity and the False Positive Rate (Specificity = 1 − FPR). As one rises, the other falls by the same amount.
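The four rates discussed above all come from the same 2×2 table. The sketch below computes them from hypothetical counts chosen to mimic a test with high sensitivity and low specificity; the function name `diagnostic_rates` and the numbers are assumptions made for illustration.

```python
def diagnostic_rates(tp, fn, fp, tn):
    """Return the four basic rates of a diagnostic test from 2x2 table counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return {
        "sensitivity": sensitivity,
        "false_negative_rate": 1 - sensitivity,
        "specificity": specificity,
        "false_positive_rate": 1 - specificity,
    }


# Hypothetical counts: 100 diseased and 100 healthy people (illustration only).
rates = diagnostic_rates(tp=95, fn=5, fp=40, tn=60)

for name, value in rates.items():
    print(f"{name}: {value:.0%}")
```

With these counts the test catches 95% of diseased people (few false negatives) but also flags 40% of healthy people, which is the pattern the question describes.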

Q: Cause to effect progression is seen in all EXCEPT?

Case control study.

In epidemiology, the direction of an inquiry is defined by whether the researcher moves from the exposure (cause) to the outcome (effect) or vice versa.

### **Why Case-Control Study is the Correct Answer**

A **Case-Control study** is primarily **retrospective** in nature. It begins with the **effect** (identifying individuals who already have the disease/cases and those who do not/controls) and looks backward in time to determine the **cause** (prior exposure). Therefore, it follows an **Effect to Cause** progression, making it the exception in this list.

### **Analysis of Incorrect Options**

* **Cohort Study:** This is the classic **Cause to Effect** design. It starts with a group of exposed and unexposed individuals (cause) and follows them forward in time to see who develops the disease (effect).
* **Randomized Control Trial (RCT):** As an experimental study, the investigator intentionally introduces an intervention (cause) and monitors the subjects for a specific outcome (effect). It follows a **Cause to Effect** progression.
* **Ecological Study:** These studies look at the association between an exposure and an outcome at a population level. While they are descriptive, they generally analyze how a factor (cause) correlates with the frequency of a disease (effect) in a population.

### **High-Yield Clinical Pearls for NEET-PG**

* **Directionality:**
  * Cohort/RCT: Forward (Cause $\rightarrow$ Effect)
  * Case-Control: Backward (Effect $\rightarrow$ Cause)
  * Cross-sectional: Snapshot (Cause and Effect at the same time)
* **Best Study for Rare Diseases:** Case-Control (starts with cases).
* **Best Study for Rare Exposures:** Cohort (starts with exposed group).
* **Incidence:** Can be calculated in Cohort studies but **not** in Case-Control studies (which calculate Odds Ratio).

Q: The same screening test is applied to two communities, X and Y. Community Y shows more false-positive cases compared to community X. What is the most likely reason for this difference?

Community Y has a lower prevalence of the condition.

### Explanation

The correct answer is **D. Community Y has a lower prevalence of the condition.**

#### 1. Why the correct answer is right

The number of false positives in a screening program is inversely related to the **Prevalence** of the disease in the population. With the same test, false positives = (1 − Specificity) × number of disease-free people, and a lower prevalence means more disease-free people are tested.

* **Positive Predictive Value (PPV)** is the probability that a person with a positive test actually has the disease. PPV rises and falls with prevalence.
* When prevalence is **low** (as in Community Y), the PPV drops. This means that out of all positive results generated by the test, a larger proportion will be **False Positives**.
* In a low-prevalence setting, the test "hunts" for rare cases among many healthy individuals, increasing the mathematical likelihood that a positive result is a false alarm.

#### 2. Why the incorrect options are wrong

* **A & B (Sensitivity and Specificity):** These are **inherent properties** of the screening test itself. Since the question states the *same* test is used in both communities, the sensitivity and specificity remain constant and cannot account for the difference in results between X and Y.
* **C (Higher Prevalence):** If Community Y had a higher prevalence, the PPV would increase, leading to *fewer* false positives and more true positives.

#### 3. High-Yield Clinical Pearls for NEET-PG

* **Prevalence vs. Predictive Values:**
  * Prevalence ↑ = PPV ↑ and NPV ↓
  * Prevalence ↓ = PPV ↓ and NPV ↑
* **Specificity vs. False Positives:** While prevalence affects the *proportion* of false positives among all positives (PPV), the false positive rate among disease-free people is fixed by the **Specificity** (False Positive Rate = 1 - Specificity).
* **Screening Strategy:** To minimize false positives in a low-prevalence community, clinicians should use a test with very high **Specificity**.
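The effect of prevalence on false positives can be illustrated by applying one test to two communities of equal size. The community size, prevalences, sensitivity, and specificity below are hypothetical values picked for the example, and `screen` is an illustrative helper, not a library function.

```python
def screen(population, prevalence, sensitivity, specificity):
    """Return (true positives, false positives, PPV) for a screened population."""
    diseased = population * prevalence
    healthy = population - diseased
    true_pos = sensitivity * diseased
    false_pos = (1 - specificity) * healthy   # healthy people who test positive
    ppv = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, ppv


# Same hypothetical test (90% sensitive, 90% specific) in both communities.
tp_x, fp_x, ppv_x = screen(10_000, prevalence=0.20, sensitivity=0.90, specificity=0.90)
tp_y, fp_y, ppv_y = screen(10_000, prevalence=0.05, sensitivity=0.90, specificity=0.90)

print(f"Community X (20% prevalence): {fp_x:.0f} false positives, PPV {ppv_x:.1%}")
print(f"Community Y (5% prevalence):  {fp_y:.0f} false positives, PPV {ppv_y:.1%}")
```

Community Y produces more false positives and a lower PPV even though the test itself is unchanged, because more of the people screened are disease-free.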

Q: A stem and leaf plot is a variant of which type of graphical representation?

Histogram.

### Explanation

**1. Why Histogram is the Correct Answer:**

A **Stem and Leaf Plot** is essentially a "textual histogram" turned on its side. Like a histogram, it displays the **distribution and frequency** of continuous numerical data.

* **The Concept:** In a histogram, data is grouped into bins (intervals) represented by bars. In a stem and leaf plot, the "Stem" represents the bin (e.g., tens digit) and the "Leaf" represents the individual data points (e.g., units digit).
* **The Advantage:** While a histogram shows the shape of the distribution, it loses individual raw data points. A stem and leaf plot retains the exact numerical values while simultaneously showing the shape (density) of the distribution, making it a hybrid of a table and a graph.

**2. Why Other Options are Incorrect:**

* **B. Frequency Polygon:** This is a line graph formed by joining the midpoints of the tops of the bars of a histogram. It is used to compare two or more distributions on the same axes, whereas a stem and leaf plot focuses on the raw data of a single distribution.
* **C. Pie Diagram:** This represents qualitative (categorical) data as proportions of a whole (360 degrees). It does not show the distribution of continuous numerical values or individual data points.

**3. NEET-PG High-Yield Pearls:**

* **Data Type:** Stem and leaf plots are used for **quantitative (numerical)** data, specifically when the dataset is small to moderate in size.
* **Shape Identification:** Just like a histogram, you can identify if a distribution is **Symmetrical, Positively Skewed, or Negatively Skewed** by looking at the "leaves."
* **Quick Tip:** If you rotate a stem and leaf plot 90 degrees counter-clockwise, the silhouette of the leaves mimics the bars of a histogram.
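To make the stem/leaf idea concrete, here is a small sketch that builds a text stem and leaf plot. It assumes non-negative whole numbers with the tens digit as stem and the units digit as leaf; the function name and the data values are made up for the example.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the lines of a stem and leaf plot (stem = tens, leaf = units).

    Assumes a non-empty list of non-negative integers.
    """
    groups = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        groups[stem].append(leaf)

    lines = []
    # Include empty stems so gaps in the distribution stay visible.
    for stem in range(min(groups), max(groups) + 1):
        leaves = " ".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines


# Hypothetical data set, for illustration only.
plot = stem_and_leaf([12, 15, 21, 23, 23, 28, 31, 34, 47])
print("\n".join(plot))
```

Each row is one bin of the equivalent histogram, and the length of the row plays the role of the bar height, while every original value can still be read back from the plot.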

Q: While calculating the incubation period for measles in a group of 25 children, the standard deviation is 2 and the mean incubation period is 8 days. Calculate the standard error.

0.4.

### Explanation

**1. Understanding the Concept and Calculation**

The **Standard Error (SE)**, specifically the Standard Error of the Mean, measures the dispersion of sample means around the true population mean. It indicates how much the sample mean is likely to vary from the actual population mean.

The formula for Standard Error is:

$$SE = \frac{SD}{\sqrt{n}}$$

*Where $SD$ = Standard Deviation and $n$ = Sample size.*

**Given in the question:**

* Standard Deviation ($SD$) = 2
* Sample size ($n$) = 25
* Mean = 8 days (Note: The mean is provided to test your ability to filter relevant data; it is not used in the SE formula).

**Calculation:**

$$SE = \frac{2}{\sqrt{25}} = \frac{2}{5} = 0.4$$

Thus, the correct answer is **0.4**.

**2. Analysis of Incorrect Options**

* **Option B (1):** This result would occur if the sample size were 4 ($2/\sqrt{4} = 1$).
* **Option C (0.5):** This result would occur if $\sqrt{n}$ were taken as 4 ($2/4$ instead of $2/5$), which corresponds to a sample size of 16 rather than 25.
* **Option D (2):** This is the value of the Standard Deviation itself, not the Standard Error.

**3. NEET-PG High-Yield Pearls**

* **SE vs. SD:** Standard Deviation describes the variability **within a single sample**, whereas Standard Error describes the variability **between different sample means**.
* **Relationship with $n$:** As the sample size ($n$) increases, the Standard Error decreases, leading to higher precision.
* **Confidence Intervals:** SE is used to calculate Confidence Intervals (CI). For a 95% CI, the formula is $Mean \pm 1.96 \times SE$.
* **Measles Fact:** While this is a biostatistics question, remember for PSM that the median incubation period for Measles is typically **10 days** (range 7–14 days), and it is most infectious during the **prodromal/pre-eruptive stage**.
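The calculation above, together with the 95% confidence interval formula from the pearls, can be reproduced as a short sketch using the figures given in the question. The variable names are chosen for this example.

```python
from math import sqrt

sd = 2        # standard deviation (days), from the question
n = 25        # sample size, from the question
mean = 8      # mean incubation period (days); not needed for the SE itself

se = sd / sqrt(n)                      # standard error of the mean
ci_low = mean - 1.96 * se              # lower limit of the 95% CI
ci_high = mean + 1.96 * se             # upper limit of the 95% CI

print(f"SE = {se:.2f}")
print(f"95% CI = {ci_low:.2f} to {ci_high:.2f} days")
```

The mean only enters when the confidence interval is built around it, which is why it can be ignored when the question asks for the standard error alone.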

Q: In a population of 10,000 individuals, 20% have a specific disease. A screening test for this disease has a sensitivity of 95% and a specificity of 80%. What is the Positive Predictive Value (PPV) of this test?

54.30%.

### Explanation

To calculate the **Positive Predictive Value (PPV)**, we must determine the probability that a person has the disease given a positive test result. This is calculated using the formula:

$$PPV = \frac{\text{True Positives (TP)}}{\text{Total Test Positives (TP + FP)}} \times 100$$

**Step-by-Step Calculation:**

1. **Prevalence:** 20% of 10,000 = **2,000 diseased** individuals; **8,000 healthy** individuals.
2. **True Positives (TP):** Sensitivity is 95%. So, 95% of 2,000 = **1,900**.
3. **False Positives (FP):** Specificity is 80%, meaning the False Positive Rate is 20% (100-80). So, 20% of 8,000 = **1,600**.
4. **PPV:** $\frac{1,900}{1,900 + 1,600} = \frac{1,900}{3,500} \approx \mathbf{54.29\%}$ (about 54.3%, listed as 54.30% in the options).

---

### Analysis of Options:

* **A (Correct):** 54.30% is the result of combining the prevalence with the test's sensitivity and specificity.
* **B (98.50%):** This is an overestimation. High sensitivity (95%) does not equate to high PPV if the specificity is relatively low (80%) or the disease is rare.
* **C (47.50%):** This corresponds to 1,900/4,000, i.e., the true positives divided by an incorrect denominator (4,000 would be the diseased count only if prevalence were 40%), or to another calculation error involving sensitivity.
* **D (20.00%):** This is simply the prevalence (Pre-test probability).

---

### NEET-PG High-Yield Pearls:

* **Prevalence Dependency:** PPV rises and falls with the prevalence of the disease in the population. If prevalence increases, PPV increases.
* **Screening vs. Diagnosis:** Sensitivity and Specificity are inherent properties of the test, while PPV and NPV (Negative Predictive Value) depend on the population's disease burden.
* **Clinical Utility:** PPV is the most useful measure for a clinician when a patient asks, "My test is positive; what are the chances I actually have the disease?"
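The step-by-step calculation above can be verified with a short sketch that uses the numbers from the question. It also computes the PPV from the equivalent Bayes' theorem form, which works directly from prevalence, sensitivity, and specificity; the variable names are chosen for this example.

```python
population = 10_000
prevalence = 0.20
sensitivity = 0.95
specificity = 0.80

# Count-based approach, following the steps in the explanation.
diseased = population * prevalence            # 2,000
healthy = population - diseased               # 8,000
true_pos = sensitivity * diseased             # 1,900
false_pos = (1 - specificity) * healthy       # 1,600
ppv_counts = true_pos / (true_pos + false_pos)

# Equivalent Bayes' theorem form: no population size needed.
ppv_bayes = (sensitivity * prevalence) / (
    sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
)

print(f"PPV from counts: {ppv_counts:.2%}")
print(f"PPV from Bayes:  {ppv_bayes:.2%}")
```

Both forms give the same value, which shows that the population size of 10,000 is only a convenience for the arithmetic and does not affect the PPV.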

Q: Which of the following statements about the P-value is incorrect?

It is equal to 1-Beta.

**Explanation**

The P-value is a fundamental concept in biostatistics used to determine the strength of evidence against the null hypothesis ($H_0$).

**Why Option B is the Correct Answer (Incorrect Statement):**

The statement "P-value is equal to $1-\beta$" is incorrect. In statistics, **$1-\beta$** represents the **Power of a study**, which is the probability of correctly rejecting a false null hypothesis (detecting a difference when one truly exists). The P-value, conversely, is related to the Type I error ($\alpha$).

**Analysis of Other Options:**

* **Option A & C:** These are the conventional exam descriptions of the P-value and are accepted as correct here: the chance of concluding that a significant difference exists when, in reality, the observed difference is due to random chance alone (a **Type I error**, or False Positive). Strictly, the P-value is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis is true, while $\alpha$ is the pre-set acceptable probability of a Type I error.
* **Option D:** This is the standard rule for significance. If the **P-value < $\alpha$** (usually 0.05), we reject the null hypothesis and conclude the result is **statistically significant**.

**High-Yield Clinical Pearls for NEET-PG:**

* **Type I Error ($\alpha$):** "Producer’s Risk." Rejecting the null hypothesis when it is true (False Positive).
* **Type II Error ($\beta$):** "Consumer’s Risk." Failing to reject the null hypothesis when it is false (False Negative).
* **Power ($1-\beta$):** Ability of a test to detect a difference. It is increased by increasing the sample size.
* **Confidence Interval (CI):** If the 95% CI for a difference between means includes **0**, or if the CI for Odds Ratio/Relative Risk includes **1**, the result is NOT statistically significant (corresponds to $P > 0.05$).
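To show that the P-value and power ($1-\beta$) are separate quantities, the sketch below computes both for a two-sided one-sample z-test. The z-test setting, the test statistic of 2.5, the effect size, and the sample sizes are all assumptions made for illustration; they do not come from the question.

```python
from math import sqrt
from statistics import NormalDist

norm = NormalDist()  # standard normal distribution


def two_sided_p_value(z_stat):
    """Probability of a result at least this extreme if the null hypothesis is true."""
    return 2 * (1 - norm.cdf(abs(z_stat)))


def power_z_test(effect, sd, n, alpha=0.05):
    """Power (1 - beta) of a two-sided one-sample z-test for a given true effect."""
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = effect * sqrt(n) / sd          # true effect expressed in standard errors
    return (1 - norm.cdf(z_crit - shift)) + norm.cdf(-z_crit - shift)


alpha = 0.05
p_value = two_sided_p_value(2.5)                        # hypothetical observed z statistic
power_small = power_z_test(effect=0.5, sd=1.0, n=16)    # hypothetical study, n = 16
power_large = power_z_test(effect=0.5, sd=1.0, n=64)    # same effect, n = 64

print(f"P-value: {p_value:.4f} -> significant at alpha={alpha}: {p_value < alpha}")
print(f"Power with n=16: {power_small:.2f}")
print(f"Power with n=64: {power_large:.2f}")
```

The P-value is computed from the observed data and compared with $\alpha$, whereas power depends on the assumed true effect and rises with sample size, in line with the pearl above.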

Question 1

All of the following are true of the standardized mortality ratio except:

Accepted Answer

It is expressed as a rate per year.

Answer

It can be adjusted for age.

Answer

It can be used for events other than death.

Answer

It is the ratio of observed deaths to expected deaths.

Question 2

A study was conducted in a coastal African country involving 274 soldiers stationed in three different camps to examine the presence of bacterial sexually transmitted diseases (STD) and human immunodeficiency virus (HIV) positivity. Data from clinical exams, laboratory specimens, and interviews regarding age, years of military service, ethnicity, and region of origin were collected. What is the most accurate description of this study design?

Accepted Answer

A cross-sectional study

Answer

A case-control study

Answer

A cohort study

Answer

A clinical trial

Question 3

What is the probability that a confounding factor falls to the right of the 95% confidence interval?

Accepted Answer

1 in 40

Answer

1 in 5

Answer

1 in 10

Answer

1 in 20

Question 4

A diagnostic test with high sensitivity and low specificity will result in which of the following?

Accepted Answer

High true positive rate

Answer

High false negative rate

Answer

Low true negative rate

Answer

Low true positive rate

Question 5

Cause to effect progression is seen in all EXCEPT?

Accepted Answer

Case control study

Answer

Ecological study

Answer

Cohort study

Answer

Randomized control trial

Question 6

The same screening test is applied to two communities, X and Y. Community Y shows more false-positive cases compared to community X. What is the most likely reason for this difference?

Accepted Answer

Community Y has a lower prevalence of the condition

Answer

High sensitivity of the test

Answer

High specificity of the test

Answer

Community Y has a higher prevalence of the condition

Question 7

A stem and leaf plot is a variant of which type of graphical representation?

Accepted Answer

Histogram

Answer

Frequency Polygon

Answer

Pie Diagram

Answer

None of the above

Question 8

While calculating the incubation period for measles in a group of 25 children, the standard deviation is 2 and the mean incubation period is 8 days. Calculate the standard error.

Accepted Answer

0.4

Answer

1

Answer

0.5

Answer

2

Question 9

In a population of 10,000 individuals, 20% have a specific disease. A screening test for this disease has a sensitivity of 95% and a specificity of 80%. What is the Positive Predictive Value (PPV) of this test?

Accepted Answer

54.30%

Answer

98.50%

Answer

47.50%

Answer

20.00%

Question 10

Which of the following statements about the P-value is incorrect?

Accepted Answer

It is equal to 1-Beta.

Answer

It is the probability of committing a Type I error.

Answer

It is the chance that the presence of a difference is concluded when actually there is none.

Answer

When the P-value is less than alpha, the result is statistically significant.

Biostatistics — MCQs
