Biostatistics Practice Questions

Q: The histogram is used as a method of group presentation for which type of data?

Quantitative continuous data.

### Explanation

**1. Why the Correct Answer is Right:**

A **Histogram** is a graphical representation of a frequency distribution for **quantitative continuous data**. In continuous data (like height, weight, or hemoglobin levels), variables can take any value within a range. To represent this, data is grouped into adjacent class intervals (e.g., 10-20, 20-30). In a histogram, the area of each rectangle is proportional to the frequency. Because the data is continuous, there are **no gaps** between the bars, signifying that the variable runs continuously from one interval to the next.

**2. Why the Other Options are Wrong:**

* **Qualitative Data (Option A):** This refers to attributes or categories (e.g., gender, blood group). These are best represented by **Bar charts** or **Pie charts**.
* **Quantitative Discrete Data (Option C):** Discrete data involves whole numbers (e.g., number of children in a family, number of hospital beds). Since there are no intermediate values between 1 and 2, these are represented by **Bar charts** with gaps between the bars to show the distinct nature of the data.
* **Nominal Data (Option D):** This is a subtype of qualitative data where there is no inherent order (e.g., religion, state of residence). Like other qualitative data, it is represented by **Bar charts** or **Pie charts**, not histograms.

**3. High-Yield Clinical Pearls for NEET-PG:**

* **Histogram vs. Bar Chart:** The most common "trap" in NEET-PG. Remember: **Histogram = No Gaps (Continuous)**; **Bar Chart = Gaps (Discrete/Qualitative)**.
* **Frequency Polygon:** Created by joining the midpoints of the tops of the bars in a histogram.
* **Line Diagram:** Best for showing **trends over time** (e.g., maternal mortality rate over a decade).
* **Scatter Diagram:** Used to show the **correlation** between two quantitative variables.
* **Ogives:** Used to represent cumulative frequency.
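The grouping into adjacent class intervals described above can be sketched in a few lines of Python. This is only an illustration: the function name `class_frequencies` and the hemoglobin values are hypothetical, not taken from the question.

```python
def class_frequencies(values, start, width, n_classes):
    """Count values falling in adjacent intervals [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_classes
    for v in values:
        i = int((v - start) // width)  # index of the class interval containing v
        if 0 <= i < n_classes:
            counts[i] += 1
    return counts

# Hypothetical hemoglobin values in g/dL (illustrative only).
hb = [9.8, 10.4, 11.1, 11.9, 12.0, 12.3, 12.7, 13.5, 13.9, 14.2]
counts = class_frequencies(hb, start=9.0, width=1.0, n_classes=6)

# Each interval starts where the previous one ends, so the bars touch (no gaps).
for i, c in enumerate(counts):
    print(f"{9.0 + i:.0f}-{10.0 + i:.0f}: {'#' * c}")
```

Because every interval shares its boundary with the next, a continuous value always falls into exactly one class, which is why the bars of a histogram are drawn without gaps.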

Q: Intraocular pressure (IOP) was measured in 400 people. The mean IOP was found to be 25 mm Hg and the standard deviation was recorded as 10 mm Hg. What is the 95% Confidence Interval for the mean IOP?

24-26 mm Hg.

### Explanation

To solve this question, we must calculate the **95% Confidence Interval (CI)** for the population mean. The formula for the 95% CI is:

**Mean ± (1.96 × Standard Error)**

*(For NEET-PG calculations, 1.96 is usually rounded to 2).*

**Step 1: Calculate the Standard Error (SE)**

Standard Error measures the dispersion of sample means around the population mean.

$SE = \frac{SD}{\sqrt{n}}$

$SE = \frac{10}{\sqrt{400}} = \frac{10}{20} = 0.5$

**Step 2: Calculate the Confidence Interval**

$95\% CI = Mean \pm (2 \times SE)$

$95\% CI = 25 \pm (2 \times 0.5)$

$95\% CI = 25 \pm 1$

**Result: 24 to 26 mm Hg.**

---

#### Analysis of Options:

* **Option C (Correct):** Correctly applies the SE formula and the multiplier for 95% confidence (Mean ± 2 SE).
* **Option A (22-28):** This is Mean ± 3 mm Hg, which equals Mean ± 6 SE here. It is far wider than the 95% CI.
* **Option B (23-27):** This is Mean ± 2 mm Hg, which equals Mean ± 4 SE here. It is not Mean ± 2 SD either: that range (covering about 95% of individuals in a normal distribution) would be 5 to 45 mm Hg. The question asks for the **CI of the mean**, which requires SE, not SD.
* **Option D (21-29):** This is Mean ± 4 mm Hg, which equals Mean ± 8 SE here and is not a standard statistical interval.

---

#### High-Yield Clinical Pearls for NEET-PG:

1. **SD vs. SE:** Use **Standard Deviation (SD)** to describe the spread of individual data points. Use **Standard Error (SE)** to describe the precision of the sample mean as an estimate of the population mean.
2. **CI Multipliers:**
   * 95% CI = Mean ± 2 SE (Exact: 1.96)
   * 99% CI = Mean ± 2.58 SE
   * 68% CI = Mean ± 1 SE
3. **Sample Size Impact:** As the sample size ($n$) increases, the SE decreases, resulting in a narrower (more precise) Confidence Interval.
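The two steps above can be checked with a short Python sketch. The function name `confidence_interval` is an assumption made for illustration; the arithmetic is simply Mean ± z × SD/√n.

```python
import math

def confidence_interval(mean, sd, n, z=1.96):
    """Return (lower, upper) limits of the CI of the mean: mean ± z * SE."""
    se = sd / math.sqrt(n)  # standard error of the mean
    return mean - z * se, mean + z * se

# Values from the question: n = 400, mean = 25 mm Hg, SD = 10 mm Hg.
print(confidence_interval(25, 10, 400, z=2))  # exam shortcut, z rounded to 2
print(confidence_interval(25, 10, 400))       # z = 1.96, roughly 24.02 to 25.98
```

With the rounded multiplier the limits are 24 and 26 mm Hg; with 1.96 they differ only in the second decimal place, which is why the rounding is acceptable for exam purposes.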

Q: What does random error primarily reduce?

Precision.

### Explanation

In biostatistics, the quality of a measurement is determined by its **reliability (precision)** and **validity (accuracy)**.

**Why Precision is the Correct Answer:**

**Precision** refers to the consistency or reproducibility of a measurement. **Random errors** are unpredictable fluctuations caused by unknown or uncontrollable variables (e.g., observer variability or sampling error). Because random errors cause results to scatter around the true mean, they directly reduce the consistency of the data. Therefore, the more random error present, the lower the precision. Precision can be improved by increasing the sample size.

**Why the Other Options are Incorrect:**

* **A. Systematic Error:** This is a consistent, repeatable error usually caused by faulty equipment or study design. It is the opposite of random error.
* **B. Bias:** Bias is a type of systematic error that results in a deviation from the truth. While random error affects precision, bias affects **accuracy**. A study can be highly precise (consistent) but still biased (inaccurate).
* **C. Confounding Factor:** This is a variable that distorts the relationship between the exposure and the outcome because it is associated with both. Confounding is a type of systematic error, not a result of random chance.

**High-Yield Clinical Pearls for NEET-PG:**

* **Random Error** $\propto$ 1 / Precision (Reduced by increasing sample size).
* **Systematic Error (Bias)** $\propto$ 1 / Accuracy (Reduced by better study design/randomization).
* **Target Analogy:**
  * Tightly grouped shots away from the bullseye = High Precision, Low Accuracy (Bias).
  * Shots scattered all over the target = Low Precision (Random Error).
  * Tightly grouped shots in the bullseye = High Precision, High Accuracy.
* **P-value** is the probability of obtaining results at least as extreme as those observed through chance (random error) alone, assuming the null hypothesis is true.

Q: The number of live births per 1000 women in the reproductive age group in a year refers to:

General fertility rate.

### Explanation

**1. Why General Fertility Rate (GFR) is Correct:**

The General Fertility Rate is a more refined measure of fertility than the Crude Birth Rate because it relates births to the specific segment of the population capable of giving birth. The denominator is restricted to women in the reproductive age group (conventionally **15–44 or 15–49 years**).

* **Formula:** $\frac{\text{Total number of live births in an area during a year}}{\text{Mid-year female population aged 15–49 years}} \times 1000$

**2. Why Other Options are Incorrect:**

* **Total Fertility Rate (TFR):** This represents the average number of children a woman would have if she were to pass through her reproductive years bearing children according to the current age-specific fertility rates. It is a completed family size projection, not a rate per 1000 women.
* **Gross Reproduction Rate (GRR):** This is similar to TFR but counts only **female births**. It indicates how many daughters a woman would have, assuming no mortality before the end of the reproductive period.
* **Net Reproduction Rate (NRR):** This is the GRR adjusted for mortality. It represents the number of daughters a newborn girl will bear during her lifetime, accounting for the risk of her dying before completing her reproductive cycle. An **NRR of 1** is the demographic goal for population stabilization.

**3. High-Yield Clinical Pearls for NEET-PG:**

* **Crude Birth Rate (CBR):** Uses the *total mid-year population* as the denominator. It is the simplest but least accurate measure of fertility.
* **Replacement Level Fertility:** Defined as an **NRR of 1** or a **TFR of 2.1**.
* **Best Indicator of Fertility:** TFR is considered the best single indicator to compare fertility levels between different populations.
* **Denominator Check:** Always look at the denominator in biostatistics questions. If it’s "women 15–49," it points toward GFR; if it's "total population," it's CBR.
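The denominator check can be made concrete with a small Python sketch. All figures below are hypothetical and chosen only to show how the same number of births yields a different rate depending on the denominator; `rate_per_1000` is an illustrative helper name.

```python
def rate_per_1000(events, population):
    """Events per 1000 members of the chosen denominator population."""
    return events * 1000 / population

# Hypothetical figures for one area in one year (not from the question).
live_births = 3_000
women_15_49 = 25_000         # mid-year female population aged 15-49
total_population = 100_000   # total mid-year population

gfr = rate_per_1000(live_births, women_15_49)       # General Fertility Rate
cbr = rate_per_1000(live_births, total_population)  # Crude Birth Rate

print(f"GFR = {gfr} per 1000 women aged 15-49")
print(f"CBR = {cbr} per 1000 population")
```

With these assumed numbers the GFR is 120 and the CBR is 30, so the numerator is identical and only the denominator separates the two measures.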

Q: In a negatively skewed distribution, which statement is true?

Mean is less than the median.

### Explanation

In biostatistics, the relationship between the **Mean, Median, and Mode** changes depending on the symmetry of the data distribution.

**1. Why the Correct Answer is Right (Option C):**

In a **negatively skewed distribution** (also known as "left-skewed"), the tail of the distribution extends toward the lower (negative) end of the scale. This occurs because there are a few extremely low values (outliers) that pull the **Mean** downward.

* The **Mode** remains at the peak (highest frequency).
* The **Median** stays in the middle as a measure of central position.
* The **Mean** is most affected by outliers and is dragged toward the tail.

Therefore, the mathematical relationship is: **Mean < Median < Mode**.

**2. Why the Other Options are Wrong:**

* **Option A:** A mean greater than the median is characteristic of a **positively skewed** distribution (Mean > Median > Mode).
* **Option B:** This occurs only in a **Normal (Symmetrical) Distribution**, where Mean = Median = Mode.
* **Option D:** In a negatively skewed distribution, the Mode is actually the **highest** value (Mode > Median > Mean), so the statement that the Mode is less than the Median is incorrect.

**3. NEET-PG High-Yield Pearls:**

* **Memory Aid:** The "Mean" is the most "sensitive" (or "mean"): it always follows the tail. If the tail is on the left (negative), the Mean is the smallest.
* **Best Measure of Central Tendency:**
  * For skewed data: **Median** (it is "robust" and not affected by outliers).
  * For nominal data: **Mode**.
  * For normal distribution: **Mean**.
* **Visual Cue:** In a graph, the order from the tail to the peak is always **Mean → Median → Mode**.
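A small numerical check may help fix the ordering in memory. The scores below are a hypothetical, deliberately left-skewed data set (most values high, a few very low); only Python's standard `statistics` module is used.

```python
import statistics

# Hypothetical exam scores: a cluster of high values with a tail of low outliers.
scores = [20, 45, 70, 80, 85, 85, 90, 90, 90, 95]

mean = statistics.mean(scores)      # dragged toward the low tail
median = statistics.median(scores)  # middle of the ordered values
mode = statistics.mode(scores)      # most frequent value (the peak)

print(mean, median, mode)
print("Mean < Median < Mode:", mean < median < mode)
```

For this data set the mean is 75, the median 85 and the mode 90, matching the ordering Mean < Median < Mode for negative skew. Real data will not always be this tidy, so treat the ordering as a rule of thumb for unimodal distributions.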

Q: Which of the following parametric tests can be used for comparing means in more than two different groups of individuals?

ANOVA test.

**Explanation:**

The core concept in this question is identifying the appropriate statistical test based on the **number of groups** and the **type of data** being compared.

**Why ANOVA is correct:**

**ANOVA (Analysis of Variance)**, specifically One-way ANOVA, is the standard parametric test used to compare the means of **three or more independent groups**. When we have more than two groups (e.g., comparing mean blood pressure across three different age groups), using multiple t-tests increases the "Type I Error" (false positive rate). ANOVA solves this by comparing the variance between groups and within groups simultaneously.

**Why the other options are incorrect:**

* **Unpaired (Independent) Student’s t-test:** This is used to compare the means of exactly **two independent groups** (e.g., comparing mean hemoglobin levels between males and females). It cannot be used for more than two groups.
* **Paired Student’s t-test:** This is used to compare means of the **same group at two different times** (e.g., pre-treatment vs. post-treatment blood sugar levels in the same patients). It is for "before and after" scenarios, not multiple different groups.

**High-Yield Clinical Pearls for NEET-PG:**

* **Parametric vs. Non-Parametric:** Remember that ANOVA and t-tests assume a **Normal (Gaussian) Distribution**.
* **Non-Parametric Equivalents:** If the data is not normally distributed, use the **Kruskal-Wallis Test** instead of ANOVA, and the **Mann-Whitney U Test** instead of the Unpaired t-test.
* **Z-test:** Used instead of a t-test when the sample size is large (**n > 30**).
* **Chi-square Test:** Used for comparing **proportions/qualitative data** (e.g., number of smokers vs. non-smokers), not means.
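The idea of comparing variance between groups with variance within groups can be sketched as a hand-computed F statistic. This is a minimal illustration using only the standard library; the function name `one_way_anova_f` and the three small groups are hypothetical, and turning F into a p-value (which needs the F distribution) is not shown.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic: between-group / within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)  # degrees of freedom: k - 1
    ms_within = ss_within / (n - k)    # degrees of freedom: n - k
    return ms_between / ms_within

# Three hypothetical independent groups (illustrative values only).
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_stat, 2))
```

A large F means the group means differ by more than the scatter inside each group would explain. One test covers all groups at once, which is how ANOVA avoids the inflated Type I error of repeated t-tests.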

Q: What proportion of the area under a normal distribution curve lies between the mean and +1 standard deviation?

0.34.

### Explanation

**1. Why the Correct Answer is Right:**

The Normal (Gaussian) Distribution is a symmetrical, bell-shaped curve defined by its mean ($\mu$) and standard deviation ($\sigma$). According to the **Empirical Rule** (also known as the 68-95-99.7 rule):

* Approximately **68.2%** of the total area lies within **$\pm$1 SD** of the mean (from -1 SD to +1 SD).
* Since the normal distribution is perfectly symmetrical, the area is divided equally on both sides of the mean.
* Therefore, the area between the mean and +1 SD is half of 68.2%, which is approximately **34% (or 0.34)**.

**2. Analysis of Incorrect Options:**

* **Option A (0.68):** This represents the *total* area between -1 SD and +1 SD. The question specifically asks for the area between the mean and *only* the positive side (+1 SD).
* **Option B (0.17):** This is half of 0.34; it does not correspond to a standard landmark in the normal distribution.
* **Option C (0.12):** This value is incorrect; however, the area between +1 SD and +2 SD is approximately 13.5% (0.135).

**3. High-Yield Clinical Pearls for NEET-PG:**

* **68-95-99.7 Rule:**
  * Mean $\pm$ 1 SD = 68.2%
  * Mean $\pm$ 2 SD = 95.4%
  * Mean $\pm$ 3 SD = 99.7%
* **Z-score:** A Z-score of +1 indicates the value is 1 SD above the mean.
* **Symmetry:** In a normal distribution, **Mean = Median = Mode**.
* **Total Area:** The total area under the curve is always **1 (or 100%)**.
* **Limits:** The curve is asymptotic, meaning the tails approach but never touch the horizontal axis.
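These areas can be verified numerically from the standard normal cumulative distribution function, written here in terms of `math.erf`. The helper names `normal_cdf` and `area_between` are illustrative assumptions.

```python
import math

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution up to z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def area_between(z_low, z_high):
    """Area under the standard normal curve between two Z-scores."""
    return normal_cdf(z_high) - normal_cdf(z_low)

print(round(area_between(0, 1), 4))   # mean to +1 SD
print(round(area_between(-1, 1), 4))  # -1 SD to +1 SD
print(round(area_between(1, 2), 4))   # +1 SD to +2 SD
```

The three printed values come out at roughly 0.3413, 0.6827 and 0.1359, in line with the 34%, 68% and 13.5% figures quoted above.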

Q: What is the best graphic representation for the frequency distribution of data gathered from a continuous variable?

Histogram.

### Explanation

**Why Histogram is the Correct Answer:**

A **Histogram** is the most appropriate graphic representation for a **continuous variable** (e.g., height, weight, blood pressure, or age). In a histogram, the data is divided into continuous class intervals (X-axis) and the frequency is represented by the area of the rectangles (Y-axis). Because the data is continuous, there are **no gaps** between the bars, signifying that the variable can take any value between the intervals.

**Analysis of Incorrect Options:**

* **A & B. Simple and Multiple Bar Graphs:** These are used for **discrete data** or qualitative (categorical) variables (e.g., number of hospital beds, gender, or types of vaccines). Unlike histograms, bar graphs have spaces between the bars to indicate that the categories are distinct and not continuous.
* **C. Line Diagram:** These are primarily used to show **trends over time** (time-series data), such as the incidence of Malaria in a city over 10 years. While a "Frequency Polygon" (a type of line graph) can represent continuous data, a standard line diagram is not the primary choice for frequency distribution.

**High-Yield Clinical Pearls for NEET-PG:**

* **Frequency Polygon:** Created by joining the midpoints of the tops of the bars in a histogram. It is useful for comparing two or more frequency distributions on the same graph.
* **Scatter Diagram:** Used to show the **correlation** (relationship) between two continuous variables (e.g., height and weight).
* **Ogives:** Also known as cumulative frequency graphs; used to determine the **median** of a distribution.
* **Pie Chart:** Best for showing the relative proportion of various components of a whole (total must be 100%).

Q: The mean weight of a group of 10 boys was calculated to be 18.2 kg. It was later discovered that the weight of one boy was incorrectly recorded as 2.0 kg instead of the correct value of 20 kg. What is the true mean weight of the group?

20.0 kg.

### Explanation

**Concept:**

The **Arithmetic Mean** is the sum of all observations divided by the total number of observations ($n$). Because the mean is calculated using every value in a dataset, it is highly sensitive to outliers and errors in data entry. To find the true mean after a recording error, we must adjust the total sum of the values.

**Step-by-Step Calculation:**

1. **Find the incorrect sum:** $Mean \times n = 18.2 \times 10 = 182\text{ kg}$.
2. **Calculate the difference:** The error was $20\text{ kg}$ (correct) vs $2.0\text{ kg}$ (incorrect). Difference = $+18\text{ kg}$.
3. **Find the correct sum:** $182 + 18 = 200\text{ kg}$.
4. **Calculate the true mean:** $200 / 10 = \mathbf{20.0\text{ kg}}$.

**Analysis of Options:**

* **Option D (Correct):** As calculated above, correcting the $18\text{ kg}$ deficit across 10 individuals adds exactly $1.8\text{ kg}$ to the initial mean ($18.2 + 1.8 = 20.0$).
* **Option A:** This is the original, incorrect mean. It fails to account for the data entry error.
* **Option B:** This is the result of adding the correct $20\text{ kg}$ to the sum without first removing the wrongly recorded $2.0\text{ kg}$: $(182 + 20) / 10 = 20.2$.
* **Option C:** This would occur if the values were swapped (recording $20$ instead of $2$), so that the $18\text{ kg}$ difference is subtracted rather than added: $(182 - 18) / 10 = 16.4$.

**High-Yield Clinical Pearls for NEET-PG:**

* **Sensitivity:** The Mean is the only measure of central tendency that uses every value in the distribution; hence, it is the most affected by extreme values (outliers).
* **Skewness:** In a **positively skewed** distribution (e.g., income), Mean > Median > Mode. In a **negatively skewed** distribution, Mean < Median < Mode.
* **Best Measure:** For normally distributed (symmetrical) data, the **Mean** is the best measure of central tendency. For skewed data (like incubation periods), the **Median** is preferred.
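The same correction can be written as a short Python function. The name `corrected_mean` and its parameters are assumptions made for illustration; the logic is the four-step calculation above (rebuild the sum, swap the wrong value for the right one, divide by n).

```python
def corrected_mean(reported_mean, n, wrong_value, right_value):
    """Recompute a mean after one wrongly recorded observation is replaced."""
    total = reported_mean * n                         # sum implied by the reported mean
    corrected_total = total - wrong_value + right_value
    return corrected_total / n

# Values from the question: 10 boys, reported mean 18.2 kg, 2.0 kg recorded instead of 20 kg.
true_mean = corrected_mean(18.2, 10, wrong_value=2.0, right_value=20.0)
print(round(true_mean, 1))
```

Note that the number of observations does not change: only the sum is adjusted, so the mean shifts by the error divided by n (18 / 10 = 1.8 kg).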

Question 1

The histogram is used as a method of group presentation for which type of data?

Accepted Answer

Quantitative continuous data

Answer

Qualitative data

Answer

Quantitative data - discrete type

Answer

Nominal data

Question 2

Intraocular pressure (IOP) was measured in 400 people. The mean IOP was found to be 25 mm Hg and the standard deviation was recorded as 10 mm Hg. What is the 95% Confidence Interval for the mean IOP?

Accepted Answer

24-26 mm Hg

Answer

22-28 mm Hg

Answer

23-27 mm Hg

Answer

21-29 mm Hg

Question 3

What does random error primarily reduce?

Accepted Answer

Precision

Answer

Systematic error

Answer

Bias

Answer

Confounding factor

Question 4

The number of live births per 1000 women in the reproductive age group in a year refers to:

Accepted Answer

General fertility rate

Answer

Total fertility rate

Answer

Gross reproduction rate

Answer

Net reproduction rate

Question 5

In a negatively skewed distribution, which statement is true?

Accepted Answer

Mean is less than the median

Answer

Mean is greater than the median

Answer

Mean is equal to the mode

Answer

Mode is less than the median

Question 6

Which of the following parametric tests can be used for comparing means in more than two different groups of individuals?

Accepted Answer

ANOVA test

Answer

Unpaired Student's t-test

Answer

Paired Student's t-test

Answer

All of the above

Question 7

What proportion of the area under a normal distribution curve lies between the mean and +1 standard deviation?

Accepted Answer

0.34

Answer

0.68

Answer

0.17

Answer

0.12

Question 8

What is the denominator for the maternal mortality rate?

Accepted Answer

100,000 live births

Answer

100,000 pregnancies

Answer

100,000 births

Answer

100,000 population

Question 9

What is the best graphic representation for the frequency distribution of data gathered from a continuous variable?

Accepted Answer

Histogram

Answer

Simple bar graph

Answer

Multiple bar graph

Answer

Line diagram

Question 10

The mean weight of a group of 10 boys was calculated to be 18.2 kg. It was later discovered that the weight of one boy was incorrectly recorded as 2.0 kg instead of the correct value of 20 kg. What is the true mean weight of the group?

Accepted Answer

20.0 kg

Answer

18.2 kg

Answer

20.2 kg

Answer

16.4 kg
