Biostatistics Practice Questions

Q: Out of 11 births in a hospital, 5 babies weighed over 2.5 kg and 5 weighed less than 2.5 kg. What value does 2.5 represent?

Median.

### Explanation

**Correct Answer: C. Median**

The **Median** is defined as the middle-most value in a distribution when the data points are arranged in ascending or descending order. It divides the distribution into two equal halves, such that 50% of the observations lie above it and 50% lie below it.

In this scenario:

* Total births ($n$) = 11.
* Observations above 2.5 kg = 5.
* Observations below 2.5 kg = 5.
* The 2.5 kg value occupies the 6th position (the exact center), making it the median.

---

### Why other options are incorrect:

* **Arithmetic Average (Mean):** This is the sum of all observations divided by the total number of observations. We cannot calculate the mean here because the specific weights of the other 10 babies are not provided.
* **Geometric Average:** This is the $n^{th}$ root of the product of all observations. It is typically used for rates and ratios (e.g., bacterial growth or population growth) and is not applicable here.
* **Mode:** This represents the most frequently occurring value in a dataset. The question does not state that 2.5 kg is the most common weight, only that it is the central point.

---

### High-Yield Clinical Pearls for NEET-PG:

1. **Best Measure of Central Tendency:**
   * For **Normally distributed (symmetrical) data**: Mean is preferred.
   * For **Skewed data (outliers)**: Median is the most robust measure as it is not affected by extreme values.
2. **Relationship in Skewed Data:**
   * **Right (Positive) Skew:** Mean > Median > Mode.
   * **Left (Negative) Skew:** Mode > Median > Mean.
3. **Median Calculation:** For an odd number of observations, the median is the $(\frac{n+1}{2})^{th}$ value. For an even number, it is the average of the two middle values.
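The position rule for an odd number of observations can be checked with a small sketch. The eleven weights below are hypothetical (the question gives only the counts above and below 2.5 kg, not the individual values); they are chosen so that five fall on each side of 2.5.

```python
from statistics import median

# Hypothetical birth weights in kg; the question does not list them.
weights = [1.9, 2.1, 2.2, 2.3, 2.4, 2.5, 2.7, 2.9, 3.0, 3.2, 3.4]

ordered = sorted(weights)
n = len(ordered)

# For odd n, the median sits at position (n + 1) / 2, counting from 1.
position = (n + 1) // 2
middle = ordered[position - 1]  # convert the 1-based position to a 0-based index

below = sum(w < middle for w in ordered)
above = sum(w > middle for w in ordered)

print(f"n = {n}, median position = {position}, median = {middle}")
print(f"below = {below}, above = {above}")
print("statistics.median agrees:", median(weights) == middle)
```

With any eleven values arranged this way, the 6th ordered value is the median regardless of how far the other ten lie from it.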

Q: If the mean is 209, the median is 196, and the mode is 135, what type of distribution does this indicate?

Positively skewed.

### Explanation

**1. Why Positively Skewed is Correct:**

In a frequency distribution, the relationship between the three measures of central tendency (Mean, Median, and Mode) determines the "skewness" or asymmetry of the curve.

* **The Rule:** In a **Positively Skewed** distribution (also known as Right-skewed), the tail of the curve extends toward the higher values (right side).
* **The Relationship:** **Mean > Median > Mode**.
* **In this question:** Mean (209) > Median (196) > Mode (135). Since the mean is pulled toward the higher extreme values, it confirms a positive skew.

**2. Why Incorrect Options are Wrong:**

* **Standard Curve (Normal Distribution):** In a perfectly symmetrical bell-shaped curve, the **Mean = Median = Mode**. Here, the values are significantly different.
* **Negatively Skewed:** In a left-skewed distribution, the tail extends toward the lower values. The relationship is reversed: **Mean < Median < Mode**.

**3. High-Yield Clinical Pearls for NEET-PG:**

* **Sensitivity to Outliers:** The **Mean** is the most affected by extreme values (outliers), while the **Mode** is the least affected.
* **Karl Pearson’s Formula:** For moderately skewed distributions:
  * $Mode = (3 \times Median) - (2 \times Mean)$
* **Clinical Example:** Income distribution or incubation periods of most infectious diseases (like Salmonellosis) typically show a positive skew.
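The ordering rule and Karl Pearson's empirical relation can be expressed as a minimal sketch. The function names `classify_skew` and `pearson_mode` are illustrative, not from any library, and the classification relies only on the ordering of the three measures.

```python
def classify_skew(mean, median, mode):
    """Classify skewness from the ordering of mean, median and mode."""
    if mean > median > mode:
        return "positively skewed"
    if mean < median < mode:
        return "negatively skewed"
    if mean == median == mode:
        return "symmetrical"
    return "indeterminate"


def pearson_mode(mean, median):
    """Empirical estimate: Mode = 3 * Median - 2 * Mean."""
    return 3 * median - 2 * mean


# Values from the question.
print(classify_skew(209, 196, 135))
# Empirical mode implied by the stated mean and median.
print(pearson_mode(209, 196))
```

For the question's values the empirical relation gives 170 rather than the stated mode of 135. The formula is an approximation for moderately skewed data, so the ordering of the three measures, not the exact identity, is what identifies the skew here.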

Q: Which of the following is a pre-requisite for the Chi-square test to compare two samples?

Both samples should be mutually exclusive.

### Explanation

The **Chi-square ($\chi^2$) test** is a non-parametric test used to compare proportions or determine the association between categorical variables.

**Why Option A is Correct:**

A fundamental assumption of the Chi-square test is the **independence of observations**. This means that each individual or observation must fall into one, and only one, category. The samples must be **mutually exclusive**; an individual cannot belong to both groups being compared (e.g., a patient cannot be in both the 'Treatment' group and the 'Placebo' group simultaneously). If the samples were related or paired (e.g., pre-test and post-test results for the same person), the McNemar Chi-square test would be used instead.

**Why Other Options are Incorrect:**

* **Option B:** If samples are not mutually exclusive, the assumption of independence is violated, leading to an overestimation of the significance (Type I error).
* **Option C:** The Chi-square test is **non-parametric**, meaning it does not require the data to follow a **Normal (Gaussian) distribution**. It is "distribution-free" and deals with frequencies rather than mean values.

**High-Yield Clinical Pearls for NEET-PG:**

* **Qualitative Data:** Chi-square is the most common test for qualitative/categorical data (e.g., Male vs. Female, Cured vs. Not Cured).
* **Yates’ Correction:** Applied when the sample size is small or any expected cell frequency is **< 5** in a 2x2 table.
* **Degrees of Freedom (df):** For a contingency table, $df = (r-1) \times (c-1)$. For a 2x2 table, $df = 1$.
* **Null Hypothesis:** The Chi-square test assumes there is no association between the variables; a p-value < 0.05 rejects this hypothesis.
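To make the statistic, the degrees of freedom, and Yates' correction concrete, here is a minimal hand-rolled sketch using only the standard library. The 2x2 counts are hypothetical, and `chi_square` is an illustrative helper rather than a library function; each subject is assumed to appear in exactly one cell.

```python
def chi_square(table, yates=False):
    """Return (statistic, df, smallest expected count) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            min_expected = min(min_expected, expected)
            diff = abs(observed - expected)
            if yates:
                # Yates' continuity correction: shrink |O - E| by 0.5.
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, min_expected


# Hypothetical counts: rows = Treatment / Placebo, columns = Cured / Not cured.
table = [[30, 20],
         [15, 35]]

stat, df, min_expected = chi_square(table)
stat_yates, _, _ = chi_square(table, yates=True)

print(f"chi-square = {stat:.3f}, df = {df}, smallest expected = {min_expected}")
print(f"with Yates' correction = {stat_yates:.3f}")
# For df = 1, the 5% critical value is 3.841.
print("reject null at 5%:", stat > 3.841)
```

The correction always reduces the statistic, which is why it is reserved for small samples where the uncorrected test would overstate significance.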

Q: What type of data description is represented by a stem-and-leaf diagram?

Stem and leaf diagram.

### Explanation

**Correct Answer: A. Stem and leaf diagram**

**Understanding the Concept:**

A **Stem-and-Leaf diagram** is a unique hybrid tool in biostatistics that acts as both a **tabular and a graphical representation** of data. It is used to describe the **frequency distribution** of a quantitative dataset.

* The "Stem" represents the leading digit(s) (e.g., tens), and the "Leaf" represents the trailing digit (e.g., units).
* Unlike a histogram, it **retains the individual raw data values** while simultaneously showing the shape of the distribution (skewness, outliers, and modal class). This makes it an excellent tool for small to medium-sized datasets where seeing every data point is necessary.

**Why the other options are incorrect:**

* **B. Box Whisker Plot:** This is used to represent the **five-number summary** of a dataset (Minimum, First Quartile, Median, Third Quartile, and Maximum). It is primarily used to visualize dispersion and identify outliers, but it does not show individual raw data points like a stem-and-leaf plot.
* **C. Forest Plot:** This is a graphical display used specifically in **Meta-analysis**. It illustrates the individual results (odds ratios/relative risks) of multiple studies and provides a "pooled" or summary effect size.
* **D. Funnel Plot:** This is a scatter plot used in Meta-analysis to detect **Publication Bias**. A symmetrical funnel indicates no bias, while an asymmetrical funnel suggests the presence of bias.

**NEET-PG High-Yield Pearls:**

* **Stem-and-Leaf vs. Histogram:** Both show distribution shape, but only the Stem-and-Leaf plot preserves the original data values.
* **Quantitative Data Tools:** Histograms, Frequency Polygons, and Box plots are for quantitative data.
* **Qualitative Data Tools:** Bar charts, Pie charts, and Pictograms are for qualitative data.
* **Scatter Diagram:** Used to show the **correlation** between two continuous variables.
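The stem/leaf split described above can be sketched in a few lines. The ages below are hypothetical, `stem_and_leaf` is an illustrative helper, and the sketch assumes non-negative integers with tens as the stem and units as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group non-negative integers by tens digit(s) (stem) and units digit (leaf)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)


# Hypothetical patient ages in years.
ages = [23, 25, 31, 34, 34, 38, 42, 47, 51]

for stem, leaves in stem_and_leaf(ages).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Every original value can be read back from the display (stem 3 with leaves 1, 4, 4, 8 is 31, 34, 34, 38), which is exactly what a histogram of the same data would lose.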

Q: A party was attended by 100 people. 60 cases of food poisoning were reported, and 12 deaths were recorded. What is the case fatality rate in this scenario?

20%.

### Explanation

**1. Understanding the Correct Answer (C: 20%)**

The **Case Fatality Rate (CFR)** measures the virulence or killing power of a disease. It is defined as the proportion of deaths from a specific disease compared to the total number of people diagnosed with that disease during a specific period.

The formula is:

$$\text{CFR} = \frac{\text{Total number of deaths due to a disease}}{\text{Total number of cases of that disease}} \times 100$$

In this scenario:

* Total cases of food poisoning = 60
* Total deaths = 12
* Calculation: $(12 / 60) \times 100 = \mathbf{20\%}$

**2. Why Other Options are Incorrect**

* **A (6%):** This value would require dividing deaths (12) by a denominator of 200, which appears nowhere in the data. Dividing deaths by all 100 attendees gives 12%, which is a mortality figure for the whole group, not the CFR.
* **B (15%):** This is a mathematical error and does not correspond to any standard epidemiological indicator in this data set.
* **D (30%):** This would require dividing the number of cases (60) by a denominator of 200. It is neither the CFR nor the Attack Rate for this data.

**3. NEET-PG High-Yield Pearls**

* **CFR vs. Mortality Rate:** CFR is a **ratio** (often expressed as a percentage), not a true rate, because time is not explicitly in the denominator.
* **Denominator Importance:** The denominator for CFR is always the **number of cases**, whereas for the Mortality Rate, it is the **total mid-year population**.
* **Clinical Significance:** CFR is the best indicator of the **severity** of an acute infectious disease and the effectiveness of treatment.
* **Attack Rate:** In this scenario, the Attack Rate would be $(60 / 100) \times 100 = 60\%$.
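The difference between the CFR and the Attack Rate comes down to which denominator is used. A minimal sketch with the question's figures (the helper name `percentage` is illustrative):

```python
def percentage(numerator, denominator):
    """Express numerator / denominator as a percentage."""
    return 100 * numerator / denominator


attendees, cases, deaths = 100, 60, 12

cfr = percentage(deaths, cases)             # deaths among those who fell ill
attack_rate = percentage(cases, attendees)  # cases among those exposed

print(f"Case fatality rate = {cfr}%")
print(f"Attack rate = {attack_rate}%")
```

Swapping the denominators is the most common slip: deaths over attendees gives 12%, which answers a different question from the one asked.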

Q: If neonatal deaths are 450, stillbirths are 212, and total live births are 12450, what is the Neonatal Mortality Rate (NMR)?

36.

**Explanation:**

The **Neonatal Mortality Rate (NMR)** is a key indicator of newborn care and maternal health. It is defined as the number of deaths of live-born infants during the first 28 completed days of life per 1,000 live births in a given year.

**Calculation:**

* **Formula:** (Number of neonatal deaths / Total live births) × 1000
* **Data provided:** Neonatal deaths = 450; Live births = 12,450.
* **Calculation:** (450 / 12,450) × 1000 = **36.14 per 1000 live births.**
* Rounding to the nearest whole number gives **36**.

**Analysis of Options:**

* **Option B (36):** Correct. This follows the standard formula using only live births in the denominator.
* **Option A (17):** Incorrect. This value does not correlate with the provided data.
* **Option C (64):** Incorrect. This distractor invites confusion with the **Perinatal Mortality Rate (PMR)**, which is (Stillbirths + Early Neonatal Deaths) / (Live births + Stillbirths) × 1000. Even if one mistakenly adds stillbirths to both the numerator and the denominator, the result is (450 + 212) / (12,450 + 212) × 1000 ≈ 52, not 64.
* **Option D (92):** Incorrect. This is a distractor value significantly higher than typical NMR ranges.

**High-Yield Clinical Pearls for NEET-PG:**

* **Denominator Rule:** For NMR, IMR, U5MR, and the Maternal Mortality Ratio, the denominator is **Live Births**. For the Perinatal Mortality Rate and the Stillbirth Rate, the denominator is **Total Births** (Live births + Stillbirths).
* **Early Neonatal Death:** Death within 0–7 days of birth.
* **Late Neonatal Death:** Death between 7–28 days of birth.
* **Most common cause of Neonatal Mortality in India:** Prematurity and low birth weight (followed by birth asphyxia and sepsis).
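The arithmetic, along with the common mistake of mixing stillbirths into the calculation, can be reproduced with a short sketch using the question's figures. The variable names are illustrative.

```python
neonatal_deaths = 450
stillbirths = 212
live_births = 12450

# Correct NMR: neonatal deaths per 1,000 live births.
nmr = neonatal_deaths * 1000 / live_births

# Common mistake: adding stillbirths to both numerator and denominator.
mistaken = (neonatal_deaths + stillbirths) * 1000 / (live_births + stillbirths)

print(f"NMR = {nmr:.2f} per 1000 live births (rounds to {round(nmr)})")
print(f"Mistaken figure with stillbirths included = {mistaken:.2f}")
```

The stillbirth count is a distractor for this question: it plays no part in the NMR.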

Q: Which statistical test is appropriate for comparing ten blood pressure readings taken before and after treatment?

Paired t-test.

**Explanation:**

The correct answer is **A. Paired t-test**.

In this scenario, the data consists of quantitative (numerical) measurements of blood pressure taken from the **same group of individuals** at two different points in time (before and after treatment). This is a classic example of **"paired" or "dependent" data**, where each "before" value has a direct match with an "after" value. The Paired t-test is specifically designed to compare the means of two related groups to determine if the intervention (treatment) caused a statistically significant change.

**Why other options are incorrect:**

* **Z-test:** Used for comparing means when the sample size is large (typically **n > 30**) and the population variance is known. Here, the sample size is small (n=10).
* **Student’s t-test (Unpaired/Independent):** Used to compare the means of two **independent** groups (e.g., comparing BP between Group A and Group B). It does not account for the relationship between "before and after" readings in the same person.
* **Correlation test:** Measures the strength and direction of a linear relationship between two variables (e.g., height and weight), but it does not compare means or determine the effectiveness of a treatment.

**High-Yield Clinical Pearls for NEET-PG:**

* **Quantitative Data + 2 Groups (Dependent/Matched):** Paired t-test.
* **Quantitative Data + 2 Groups (Independent):** Unpaired t-test.
* **Quantitative Data + >2 Groups:** ANOVA (Analysis of Variance).
* **Qualitative (Categorical) Data:** Chi-square test.
* If the data is not normally distributed (Non-parametric), the alternative to the Paired t-test is the **Wilcoxon Signed Rank Test**.
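A paired t-test works on the within-person differences rather than on the two sets of readings separately. The sketch below computes the statistic by hand with the standard library; the ten before/after systolic readings are hypothetical, since the question supplies no data.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical systolic BP (mmHg) for the same ten patients.
before = [150, 148, 160, 155, 162, 158, 149, 153, 157, 151]
after = [142, 145, 150, 149, 155, 150, 147, 148, 150, 146]

# Pairing: one difference per patient.
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)

mean_diff = mean(diffs)
se_diff = stdev(diffs) / sqrt(n)  # standard error of the mean difference
t_stat = mean_diff / se_diff
df = n - 1

print(f"mean difference = {mean_diff}, t = {t_stat:.2f}, df = {df}")
# Two-sided 5% critical value of t for df = 9 is 2.262.
print("significant at 5%:", abs(t_stat) > 2.262)
```

An unpaired test on the same numbers would ignore the pairing and compare the two group means against the much larger between-patient variability, which is why it is the wrong choice for before/after data.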

Q: Which of the following is NOT a method of survival analysis?

Survival rate.

### Explanation

The core of this question lies in distinguishing between a **statistical method (test)** and a **descriptive measure (parameter)**.

**Why "Survival Rate" is the correct answer:**

Survival rate is a **descriptive statistic** (a proportion or percentage) that indicates the fraction of people in a study group who are alive after a certain period (e.g., 5-year survival rate). It is an **outcome measure**, not a mathematical method or test used to analyze time-to-event data.

**Analysis of Incorrect Options:**

* **A. Kaplan-Meier Method:** This is the most common **non-parametric** method used to estimate the survival function. It uses "product-limit" calculations and is ideal for small samples where the exact time of death/event is known for each subject.
* **B. Actuarial Method (Life Table):** Also known as the "Interval-based" method, it is used for large samples. It calculates survival probabilities over fixed time intervals (e.g., 1 year, 5 years) rather than at the exact time of each event.
* **C. Kruskal-Wallis Test:** While this is a non-parametric test, it is used to compare the medians of **three or more independent groups**. It is essentially the non-parametric alternative to one-way ANOVA. Since it can be used to compare survival times across multiple groups, it is considered a tool within the broader scope of survival analysis.

**High-Yield NEET-PG Pearls:**

* **Survival Analysis:** Used when the outcome of interest is the **time** until an event occurs (Time-to-event data).
* **Censoring:** A unique feature of survival analysis where the event has not occurred for a subject by the end of the study or they are lost to follow-up.
* **Log-Rank Test:** The most common statistical test used to compare the survival curves of two or more groups.
* **Cox Proportional Hazards Model:** A semi-parametric regression method used to investigate the relationship between survival time and several predictor variables.
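The "product-limit" idea behind Kaplan-Meier, and the way censored subjects are handled, can be illustrated with a minimal sketch. The follow-up times and event flags are hypothetical, and `kaplan_meier` is an illustrative helper, not a library function; subjects censored at an event time are assumed to still be at risk at that time.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one event.

    events[i] is 1 if the event (e.g., death) was observed, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []

    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months; 0 marks a censored subject.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]

for t, s in kaplan_meier(times, events):
    print(f"t = {t}: S(t) = {s:.2f}")
```

Censored subjects never cause a drop in the curve, but they shrink the risk set, so later events produce larger steps. A simple survival rate cannot make that adjustment, which is the distinction the question is testing.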

Question 1

The American Diabetes Association (ADA) recently lowered the cut-off value for fasting glucose used in diagnosing diabetes mellitus from 140 mg/dL to 126 mg/dL. This reference interval change would be expected to produce which of the following alterations?

Accepted Answer

Increase the test's negative predictive value

Answer

Decrease the test's sensitivity

Answer

Increase the test's false negative rate

Answer

Increase the test's positive predictive value

Question 2

Out of 11 births in a hospital, 5 babies weighed over 2.5 kg and 5 weighed less than 2.5 kg. What value does 2.5 represent?

Accepted Answer

Median

Answer

Geometric average

Answer

Arithmetic average

Answer

Mode

Question 3

If the mean is 209, the median is 196, and the mode is 135, what type of distribution does this indicate?

Accepted Answer

Positively skewed

Answer

Standard curve

Answer

Negatively skewed

Answer

J-shaped

Question 4

Which of the following is a pre-requisite for the Chi-square test to compare two samples?

Accepted Answer

Both samples should be mutually exclusive

Answer

Both samples need not be mutually exclusive

Answer

Normal distribution

Answer

All of the above

Question 5

What type of data description is represented by a stem-and-leaf diagram?

Accepted Answer

Stem and leaf diagram

Answer

Box whisker plot

Answer

Forest plot

Answer

Funnel plot

Question 6

A party was attended by 100 people. 60 cases of food poisoning were reported, and 12 deaths were recorded. What is the case fatality rate in this scenario?

Accepted Answer

20%

Answer

6%

Answer

15%

Answer

30%

Question 7

What is the minimum number of newborns that should be examined in a given population to accurately calculate the percentage of low birth weight (LBW) babies?

Accepted Answer

500 babies

Answer

100 babies

Answer

1000 babies

Answer

10,000 babies

Question 8

If neonatal deaths are 450, stillbirths are 212, and total live births are 12450, what is the Neonatal Mortality Rate (NMR)?

Accepted Answer

36

Answer

17

Answer

64

Answer

92

Question 9

Which statistical test is appropriate for comparing ten blood pressure readings taken before and after treatment?

Accepted Answer

Paired t-test

Answer

Z-test

Answer

Student's t-test

Answer

Correlation test

Question 10

Which of the following is NOT a method of survival analysis?

Accepted Answer

Survival rate

Answer

Kaplan-Meier method

Answer

Actuarial method

Answer

Kruskal-Wallis test
