Biostatistics Practice Questions

Q: What is the formula for the standard error of the mean in a population?

SD/√n.

**Explanation:**

**1. Why Option C is Correct:**

The **Standard Error of the Mean (SEM)** measures the precision of the sample mean as an estimate of the population mean. It quantifies how much the sample mean would be expected to vary around the true population mean if the study were repeated many times. It is calculated by dividing the Standard Deviation (SD) by the square root of the sample size ($n$).

**Formula:** $SEM = \frac{SD}{\sqrt{n}}$

As the sample size ($n$) increases, the SEM decreases, indicating that larger samples provide a more precise estimate of the population mean.

**2. Why Other Options are Incorrect:**

* **Option A (SD/n):** This is a common distractor. Dividing by $n$ instead of $\sqrt{n}$ understates the error for any sample with $n > 1$, and the result is not a recognized statistic.
* **Option B (SD/mean):** This is the formula for the **Coefficient of Variation (CV)**, which expresses the SD relative to the mean (usually as a percentage) to compare variability between different datasets.
* **Option D (Mean/SD):** This is the inverse of the Coefficient of Variation and has no specific application in standard biostatistical reporting.

**3. High-Yield Clinical Pearls for NEET-PG:**

* **SD vs. SEM:** Use **SD** to describe the spread/variability of individual data points within a single sample. Use **SEM** to describe the uncertainty or "play of chance" in the mean itself.
* **Confidence Intervals (CI):** SEM is used to calculate the 95% CI ($Mean \pm 1.96 \times SEM$, often rounded to $Mean \pm 2 \times SEM$).
* **Relationship:** SEM is smaller than the SD for any sample with $n > 1$.
* **Sample Size:** To reduce the SEM by half, the sample size must be increased fourfold (due to the square root).
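The formula and the "fourfold sample size" pearl can be checked numerically. The following is a minimal sketch using only the Python standard library; the `readings` values are made up for illustration, and the helper name `standard_error` is an assumption, not something from the question bank.

```python
import math
import statistics


def standard_error(values):
    """Return the standard error of the mean: sample SD divided by sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to estimate the SD")
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    return sd / math.sqrt(n)


# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
readings = [118, 122, 125, 130, 135, 121, 128, 126]

mean = statistics.mean(readings)
sem = standard_error(readings)

# Approximate 95% confidence interval: mean +/- 1.96 * SEM.
ci_low, ci_high = mean - 1.96 * sem, mean + 1.96 * sem

# With the SD held fixed, quadrupling n halves the SEM.
sd = 10.0
sem_25 = sd / math.sqrt(25)    # n = 25
sem_100 = sd / math.sqrt(100)  # n = 100, four times larger

print(f"mean={mean:.2f}, SEM={sem:.2f}, 95% CI=({ci_low:.2f}, {ci_high:.2f})")
print(f"SEM at n=25: {sem_25}, SEM at n=100: {sem_100}")
```

Note that `statistics.stdev` uses the sample ($n - 1$) denominator, which is the usual choice when the SD is estimated from a sample rather than known for the whole population.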

Q: Within how many days should a birth be registered?

21 days.

**Explanation:**

The registration of vital events (births and deaths) in India is governed by the **Registration of Births and Deaths (RBD) Act, 1969**. According to this Act, the uniform time limit for the registration of births, deaths, and stillbirths is **21 days** from the date of the event.

* **Why 21 days is correct:** The Central Government mandated this 21-day window to ensure uniformity across all states. Registration within this period is free of charge. This data is crucial for calculating the Birth Rate and monitoring population dynamics.

**Analysis of Incorrect Options:**

* **7 days & 14 days:** These were historical time limits used in certain states before the 1969 Act was fully implemented and standardized. They are no longer the legal standard in India.
* **30 days:** While registration can occur after 21 days, it is considered "delayed registration." Registration between 21 and 30 days requires a late fee and a self-declaration.

**High-Yield Clinical Pearls for NEET-PG:**

* **The RBD Act, 1969:** Came into force on April 1, 1970.
* **Hierarchy:** The Registrar General of India (RGI) operates at the central level, while the Chief Registrar of Births and Deaths operates at the state level.
* **Delayed Registration Rules:**
  * **21–30 days:** Late fee + prescribed form.
  * **30 days to 1 year:** Written permission from the prescribed authority + late fee + affidavit.
  * **>1 year:** Order from a First Class Magistrate + late fee.
* **Stillbirths:** The registration period for stillbirths is also **21 days**.

Q: What is simple random sampling?

Haphazard collection of a certain number for a sample.

**Explanation:**

**Simple Random Sampling (SRS)** is the most basic form of probability sampling, where every individual in the population (sampling frame) has an **equal and independent chance** of being selected. In the context of the options provided, "haphazard collection" refers to the lack of a predetermined pattern or systematic bias, so that the selection is governed purely by chance (e.g., using a lottery method or random number table).

**Analysis of Options:**

* **Option B (Correct):** It describes the essence of randomness: selecting a sample without any specific order or preference, so that the selection of one individual does not influence the selection of another.
* **Option A:** Incorrect. SRS actually allows for a very large number of possible sample combinations compared to restricted sampling methods.
* **Option C:** Incorrect. This describes **Systematic Random Sampling**, where subjects are picked at a fixed "sampling interval" (kth unit).
* **Option D:** Incorrect. This describes **Stratified Random Sampling**, where the population is divided into homogenous groups (strata) before sampling to ensure representation.

**NEET-PG High-Yield Pearls:**

* **Gold Standard:** SRS is the ideal method if a complete list of the population (sampling frame) is available.
* **Methods of SRS:** Lottery method, Tippett’s random number table, or computer-generated random numbers.
* **Requirement:** The population must be **homogenous** for SRS to be truly representative. If the population is heterogenous, Stratified Sampling is preferred.
* **Sampling Bias:** Because selection is left entirely to chance, SRS minimizes selection bias when drawing participants for clinical trials and epidemiological studies.
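The difference between simple random sampling and systematic sampling (Option C) can be sketched in a few lines. This is an illustrative example only: the sampling frame of 100 numbered subjects, the helper names, and the seed are all assumptions made for the demonstration.

```python
import random


def simple_random_sample(frame, k, seed=None):
    """Draw k units without replacement; every unit is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, k)


def systematic_sample(frame, interval, start=0):
    """Pick every `interval`-th unit, beginning at index `start`."""
    return frame[start::interval]


# Hypothetical sampling frame: subjects numbered 1 to 100.
frame = list(range(1, 101))

srs = simple_random_sample(frame, 10, seed=42)
systematic = systematic_sample(frame, 10, start=4)

print("Simple random sample:", sorted(srs))
print("Systematic sample:   ", systematic)
```

The systematic sample follows a fixed pattern (every 10th subject), whereas the simple random sample has no pattern: any set of 10 subjects is as likely as any other.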

Q: The regression between height and age follows the equation y = a + bx. What type of curve does this represent?

Straight line.

### Explanation

**1. Why "Straight Line" is Correct:**

The equation **y = a + bx** is the standard mathematical formula for a **Simple Linear Regression**.

* **y** is the dependent variable (e.g., Height).
* **x** is the independent variable (e.g., Age).
* **a** is the intercept (the value of y when x is zero).
* **b** is the regression coefficient (the slope of the line).

In biostatistics, linear regression is used to predict the value of one continuous variable based on another. Because the power of the variable 'x' is 1 (first-degree equation), the relationship plotted on a Cartesian plane results in a **straight line**.

**2. Why Other Options are Incorrect:**

* **A. Hyperbola:** This represents an inverse relationship (y = 1/x) where one variable increases as the other decreases at a non-linear rate.
* **B. Sigmoid:** This is an S-shaped curve common in **Logistic Regression**, used when the dependent variable is binary (e.g., Dead/Alive or Diseased/Healthy).
* **C. Parabola:** This represents a quadratic relationship ($y = ax^2 + bx + c$), where the direction of the curve changes once.

**3. Clinical Pearls & High-Yield Facts for NEET-PG:**

* **Correlation vs. Regression:** Correlation ($r$) measures the *strength and direction* of a relationship, while Regression ($b$) allows for the *prediction* of one variable from another.
* **Coefficient of Determination ($r^2$):** This indicates the proportion of variance in the dependent variable that is predictable from the independent variable.
* **Range of $r$:** Correlation coefficient ranges from -1 to +1, whereas the regression coefficient ($b$) can range from $-\infty$ to $+\infty$.
* **Scatter Diagram:** This is the best visual method to initially assess the relationship between two quantitative variables before calculating regression.
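To see how the intercept $a$ and slope $b$ are obtained, here is a minimal least-squares sketch in plain Python. The age and height values are invented and deliberately lie exactly on a line so the result is easy to verify by hand; the function name `fit_line` is an assumption for this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through (mean_x, mean_y).
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data: age in years and height in cm, constructed as 72 + 7 * age.
ages = [2, 4, 6, 8, 10]
heights = [86, 100, 114, 128, 142]

a, b = fit_line(ages, heights)
predicted_height_at_7 = a + b * 7

print(f"intercept a = {a}, slope b = {b}")
print(f"predicted height at age 7 = {predicted_height_at_7} cm")
```

Every one-year increase in age adds the same amount $b$ to the predicted height, which is exactly what makes the plotted relationship a straight line.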

Q: Dependency ratio includes which of the following age groups?

Less than 15 years.

**Explanation:**

The **Dependency Ratio** is a demographic indicator used to measure the economic burden on the productive portion of a population. It is defined as the ratio of the "dependent" population (those who are generally not in the labor force) to the "productive" population (those who support them).

**1. Why Option A is Correct:**

In biostatistics and demography, the population is divided into three functional age groups:

* **Young Dependents:** 0–14 years (Less than 15 years).
* **Productive Age Group:** 15–64 years.
* **Old Dependents:** 65 years and above.

Since the question asks which age group is *included* in the dependency ratio, **Option A (<15 years)** is correct as it represents the young dependency component.

**2. Why Other Options are Incorrect:**

* **Option B (<85 years):** This is too broad and includes the productive age group (15–64), which is the denominator, not the dependent numerator.
* **Option C (30–50 years):** This group falls entirely within the "economically active" or productive age bracket (15–64 years).

**3. NEET-PG High-Yield Pearls:**

* **Formula:** $\text{Dependency Ratio} = \frac{(\text{Population } 0-14) + (\text{Population } 65+)}{\text{Population } 15-64} \times 100$
* **Total Dependency Ratio:** Sum of Young Dependency Ratio + Old Dependency Ratio.
* **Demographic Dividend:** Occurs when the dependency ratio declines due to a bulge in the working-age population (15–64 years).
* **Note:** In the Indian context, some older texts may use 0–14 and 60+ as dependents, but the international standard (WHO/UN) used in most exams is 0–14 and 65+.
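The formula can be worked through with a small example. The population counts below are hypothetical (in thousands) and the function name is an assumption; the sketch only illustrates that the total ratio equals the young and old ratios added together.

```python
def dependency_ratio(young, working, old):
    """Dependents (0-14 plus 65+) per 100 people of working age (15-64)."""
    if working <= 0:
        raise ValueError("working-age population must be positive")
    return (young + old) / working * 100


# Hypothetical population counts, in thousands.
young = 300    # aged 0-14
working = 600  # aged 15-64
old = 100      # aged 65 and above

total_ratio = dependency_ratio(young, working, old)
young_ratio = young / working * 100
old_ratio = old / working * 100

print(f"Young dependency ratio: {young_ratio:.1f}")
print(f"Old dependency ratio:   {old_ratio:.1f}")
print(f"Total dependency ratio: {total_ratio:.1f}")
```

In this made-up population there are roughly 67 dependents for every 100 people of working age, 50 of them young and about 17 of them old.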

Q: In a positively skewed distribution, what is the relationship between the mean, median, and mode?

Mean > Median > Mode.

### Explanation

In biostatistics, the relationship between measures of central tendency (mean, median, and mode) depends on the symmetry of the frequency distribution.

**1. Why Option B is Correct (Mean > Median > Mode):**

A **positively skewed distribution** (also known as right-skewed) is characterized by a long tail extending toward the higher values on the right side of the horizontal axis.

* **The Mean** is highly sensitive to extreme values (outliers). In a positive skew, the few very high values pull the mean toward the right.
* **The Mode** remains at the peak of the curve (the most frequent value).
* **The Median** falls in between, as it is the middle-most value and is less affected by outliers than the mean.

Therefore, for a typical unimodal positively skewed distribution, the relationship is **Mean > Median > Mode**.

**2. Why Other Options are Incorrect:**

* **Option A (Mean = Median = Mode):** This occurs in a symmetrical unimodal distribution, the classic example being the bell-shaped **Normal (Gaussian) Distribution**.
* **Option C (Mode > Median > Mean):** This describes a **Negatively Skewed Distribution** (left-skewed), where the tail extends toward the lower values, pulling the mean down.

**3. High-Yield Clinical Pearls for NEET-PG:**

* **Memory Aid:** In a **P**ositive skew, the Mean is **P**ulled toward the tail (the higher side).
* **Best Measure of Central Tendency:**
  * For skewed data: **Median** (it is "robust" against outliers).
  * For nominal data: **Mode**.
  * For normally distributed data: **Mean**.
* **Standard Deviation:** In any skewed distribution, the standard deviation is not an ideal measure of dispersion; the **Interquartile Range (IQR)** is preferred.
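A small dataset makes the ordering concrete. The values below are invented for illustration (think of something like length of hospital stay in days, with a couple of very long stays); the second list is simply the first one mirrored to produce a negative skew.

```python
import statistics

# Hypothetical right-skewed data: most values are small, a few are very large.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

mean_r = statistics.mean(right_skewed)      # pulled up by 9 and 19
median_r = statistics.median(right_skewed)  # middle of the ordered values
mode_r = statistics.mode(right_skewed)      # most frequent value

# Mirror the data (20 - x) to get a left-skewed version of the same shape.
left_skewed = [20 - x for x in right_skewed]

mean_l = statistics.mean(left_skewed)
median_l = statistics.median(left_skewed)
mode_l = statistics.mode(left_skewed)

print(f"Right skew: mean={mean_r}, median={median_r}, mode={mode_r}")
print(f"Left skew:  mean={mean_l}, median={median_l}, mode={mode_l}")
```

In the right-skewed list the two large values drag the mean above the median, which in turn sits above the mode; mirroring the data reverses the order.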

Q: True positivity is indicated by which of the following measures?

Sensitivity.

**Explanation:**

**Sensitivity** is defined as the ability of a test to correctly identify those who have the disease. It represents the **True Positive Rate**. Mathematically, it is calculated as: `Sensitivity = [True Positives (TP) / (True Positives + False Negatives)] × 100`. In clinical practice, a highly sensitive test is used for screening because it ensures that very few cases are missed (low false negatives).

**Analysis of Incorrect Options:**

* **B. Specificity:** This measures the **True Negativity**. It is the ability of a test to correctly identify those without the disease. It is calculated as `[True Negatives (TN) / (True Negatives + False Positives)] × 100`.
* **C. Predictive Value:** This refers to the probability that a person with a positive test result actually has the disease (Positive Predictive Value) or a person with a negative result is truly healthy (Negative Predictive Value). It depends heavily on the **prevalence** of the disease in the population.
* **D. Validity:** This is a broader term indicating the accuracy of a test, i.e., the degree to which a test measures what it intends to measure. It encompasses both sensitivity and specificity.

**High-Yield Pearls for NEET-PG:**

* **SNOUT:** **S**ensitivity rules **OUT** disease (when the result is negative).
* **SPIN:** **S**pecificity rules **IN** disease (when the result is positive).
* Sensitivity and Specificity are **inherent properties** of a test and do not change with disease prevalence.
* In contrast, **Predictive Values** vary with prevalence: as prevalence increases, PPV rises and NPV falls.
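The claim that predictive values move with prevalence while sensitivity and specificity stay fixed can be checked with a 2×2 table. The sketch below is illustrative: the test characteristics (90% sensitivity and specificity), the two prevalence values, and the helper names are assumptions chosen for the example.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute the four standard measures from the cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


def expected_table(prevalence, sensitivity, specificity, n=10_000):
    """Expected 2x2 cell counts when the test is applied to n people."""
    diseased = n * prevalence
    healthy = n - diseased
    tp = sensitivity * diseased
    fn = diseased - tp
    tn = specificity * healthy
    fp = healthy - tn
    return tp, fp, fn, tn


# Same hypothetical test (90% sensitive, 90% specific) at two prevalences.
low_prev = diagnostic_metrics(*expected_table(0.01, 0.90, 0.90))
high_prev = diagnostic_metrics(*expected_table(0.20, 0.90, 0.90))

for label, m in (("Prevalence 1%", low_prev), ("Prevalence 20%", high_prev)):
    print(label, {name: round(value, 3) for name, value in m.items()})
```

With the test itself unchanged, raising the prevalence from 1% to 20% lifts the PPV from about 8% to about 69%, while the NPV drops slightly.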

Q: To test the association between a risk factor and a disease, which of the following is considered the weakest study design?

Ecological study.

### Explanation

In biostatistics and epidemiology, the strength of a study design is determined by its ability to establish a causal relationship between an exposure and an outcome.

**Why Ecological Study is the Correct Answer:**

An **Ecological study** is considered the weakest among the given options because it uses **aggregate data** (populations or groups) rather than individual data. Because the exposure and outcome are not linked at the individual level, it is impossible to confirm if the individuals who developed the disease were the same ones exposed to the risk factor. This leads to the **"Ecological Fallacy"**: the error of making inferences about individuals based on group data.

**Analysis of Incorrect Options:**

* **Cohort Study (B):** This is the strongest observational design. It starts with exposed and non-exposed individuals and follows them forward in time to calculate **Relative Risk**.
* **Case–Control Study (A):** Stronger than ecological studies because it compares individuals with the disease (cases) to those without (controls) to determine past exposure, calculating the **Odds Ratio**.
* **Cross-sectional Study (D):** While it only provides a "snapshot" of prevalence and cannot establish temporal sequence, it still uses **individual-level data**, making it more robust than an ecological study for suggesting associations.

**NEET-PG High-Yield Pearls:**

* **Hierarchy of Evidence (Descending order):** Meta-analysis > Systematic Review > RCT > Cohort > Case-Control > Cross-sectional > Ecological > Case series/Report.
* **Unit of Study:** In Ecological studies, the unit is a **Population/Country** (e.g., correlating per capita fat consumption with breast cancer rates across different nations).
* **Ecological Fallacy:** Also known as the "Aggregation bias."

Q: If the prevalence of blindness in a population is 0.005 and the prevalence of deafness is 0.001, what is the total prevalence of deafness and blindness combined?

0.000005.

### Explanation

**1. Understanding the Correct Answer (Option B: 0.000005)**

In biostatistics, when we calculate the prevalence of two conditions occurring **simultaneously** in the same individual (e.g., a person being both deaf AND blind), we apply the **Multiplication Rule of Probability**. Assuming that deafness and blindness are independent events in this population, the probability of both occurring together is the product of their individual prevalences:

* **Formula:** $P(A \text{ and } B) = P(A) \times P(B)$
* **Calculation:** $0.005 \times 0.001 = 0.000005$

This represents the "combined" prevalence in terms of co-morbidity (dual sensory impairment).

**2. Analysis of Incorrect Options**

* **Option C (0.006):** This is the result of the **Addition Rule** ($0.005 + 0.001$). This would approximate the prevalence of having *either* blindness *or* deafness (the total burden of either disability in the population), not the combined occurrence in a single individual. Because the two conditions are not mutually exclusive, the exact "either" value under independence is $0.005 + 0.001 - 0.000005 = 0.005995$.
* **Option A:** This is numerically identical to Option B. In competitive exams like NEET-PG, if two options are identical and correct, it often stems from a typographical error in the question paper, but the mathematical logic remains the multiplication of the two values.

**3. Clinical Pearls & High-Yield Facts**

* **Independent Events:** Use the **Multiplication Rule** (Product) to find the probability of both events happening together.
* **Mutually Exclusive Events:** Use the **Addition Rule** (Sum) to find the probability of either one or the other event happening.
* **Prevalence vs. Incidence:** Remember that Prevalence = Incidence × Mean Duration of disease ($P = I \times D$).
* **NEET-PG Tip:** Always read carefully if the question asks for "both together" (Multiplication) or "either/or" (Addition). In the context of "combined" prevalence for rare independent traits, examiners usually look for the co-occurrence rate.
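The two probability rules can be compared side by side with the figures from the question. This is a minimal sketch that assumes, as the explanation does, that blindness and deafness are independent; the function names are made up for the example.

```python
def prob_both(p_a, p_b):
    """Multiplication rule for independent events: P(A and B) = P(A) * P(B)."""
    return p_a * p_b


def prob_either(p_a, p_b):
    """General addition rule under independence: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - prob_both(p_a, p_b)


p_blind = 0.005  # prevalence of blindness, from the question
p_deaf = 0.001   # prevalence of deafness, from the question

p_deaf_and_blind = prob_both(p_blind, p_deaf)
p_deaf_or_blind = prob_either(p_blind, p_deaf)

print(f"Both deaf and blind:   {p_deaf_and_blind:.6f}")
print(f"Either deaf or blind:  {p_deaf_or_blind:.6f}")
print(f"Expected per 1,000,000 people with both: {p_deaf_and_blind * 1_000_000:.0f}")
```

The simple sum 0.006 slightly overstates the "either" prevalence because it counts people with both conditions twice; for rare conditions the difference is very small.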

Question 1

What is the formula for the standard error of the mean in a population?

Accepted Answer

SD/√n

Answer

SD/n

Answer

SD/mean

Answer

Mean/SD

Question 2

Which of the following statements regarding Meta-analysis is FALSE?

Accepted Answer

The validity of a meta-analysis does not depend on the quality of the systematic review.

Answer

Its primary purpose is not to identify risk factors.

Answer

Its purpose is to increase statistical power by increasing the sample size.

Answer

It is a statistical technique for combining the findings from several independent studies on a specific topic.

Question 3

Within how many days should a birth be registered?

Accepted Answer

21 days

Answer

7 days

Answer

14 days

Answer

30 days

Question 4

What is simple random sampling?

Accepted Answer

Haphazard collection of a certain number for a sample.

Answer

Provides the least number of possible samples.

Answer

Picking every 5th or 10th subject at regular intervals.

Answer

A sample that represents a corresponding stratum of the universe.

Question 5

The regression between height and age follows the equation y = a + bx. What type of curve does this represent?

Accepted Answer

Straight line

Answer

Hyperbola

Answer

Sigmoid

Answer

Parabola

Question 6

Dependency ratio includes which of the following age groups?

Accepted Answer

Less than 15 years

Answer

Less than 85 years

Answer

30-50 years

Answer

None of the above

Question 7

In a positively skewed distribution, what is the relationship between the mean, median, and mode?

Accepted Answer

Mean > Median > Mode

Answer

Mean = Median = Mode

Answer

Mode > Median > Mean

Answer

None of the above

Question 8

True positivity is indicated by which of the following measures?

Accepted Answer

Sensitivity

Answer

Specificity

Answer

Predictive value

Answer

Validity

Question 9

To test the association between a risk factor and a disease, which of the following is considered the weakest study design?

Accepted Answer

Ecological study

Answer

Case–control study

Answer

Cohort study

Answer

Cross-sectional study

Question 10

If the prevalence of blindness in a population is 0.005 and the prevalence of deafness is 0.001, what is the total prevalence of deafness and blindness combined?

Accepted Answer

0.000005

Answer

0.006

Answer

None of the above

Biostatistics — MCQs
