Biostatistics Practice Questions

Q: Which of the following graphical methods is best for studying the decline in the percentage of alcohol usage in men and women over several years?

Line diagram.

### Explanation

**1. Why the Correct Answer is Right:** The **Line Diagram** (or Line Graph) is the most suitable method for representing **trends over time**. In biostatistics, when we need to observe the progression, decline, or fluctuation of a variable (like alcohol usage percentage) across several years, a line diagram connects data points to show the direction of change. It is particularly useful for comparing two or more groups (men vs. women) on the same axes, allowing a clear visual comparison of their respective trends.

**2. Why Other Options are Incorrect:**

* **Pie Chart:** This is used to show the **proportional distribution** of a single variable at a specific point in time (e.g., the share of different types of substances used). It cannot represent changes over a time series.
* **Histogram:** This is used for **continuous quantitative data** to show frequency distribution within a single group. It consists of adjacent rectangles where the area represents the frequency. It is not designed to show trends over years.
* **Frequency Polygon:** This is a derivative of the histogram, created by joining the midpoints of the tops of the histogram bars. While it shows distribution, it is used for frequency data, not for tracking a percentage trend over a chronological period.

**3. Clinical Pearls & High-Yield Facts for NEET-PG:**

* **Trend over time:** Always choose **Line Diagram**.
* **Correlation between two variables:** Always choose **Scatter Diagram**.
* **Comparison of discrete/qualitative data:** Use a **Bar Chart**.
* **Frequency distribution of continuous data:** Use a **Histogram**.
* **To find Median graphically:** Use an **Ogive** (Cumulative frequency curve).
* **Pictogram:** Uses images to represent data; easiest for a layperson to understand but least accurate.

Q: Which of the following is NOT a probability sampling method?

Quota sampling.

### Explanation

In biostatistics, sampling methods are broadly categorized into **Probability** (Random) and **Non-probability** (Non-random) sampling.

**Why Quota Sampling is the Correct Answer:** Quota sampling is a **non-probability sampling method**. In this technique, the population is divided into strata (e.g., age, gender), and the researcher is assigned a specific "quota" to fill from each group. However, unlike stratified random sampling, the selection of individuals within these quotas is done through **convenience or judgment** rather than random selection. Because members of the population do not have a known, non-zero chance of being selected, it is not a probability method.

**Analysis of Incorrect Options:**

* **A. Simple Random Sampling:** The "gold standard" of probability sampling where every individual has an equal and independent chance of being selected (e.g., using a lottery method or random number table).
* **B. Systematic Random Sampling:** A probability method where the first unit is selected randomly, and subsequent units are chosen at fixed intervals (every $k^{th}$ unit). It is often used in OPD settings.
* **C. Cluster Sampling:** A probability method where the population is divided into "clusters" (e.g., villages or wards), and entire clusters are randomly selected. This is the method used by the WHO for Expanded Programme on Immunization (EPI) coverage surveys (30 × 7 cluster technique).

**High-Yield Clinical Pearls for NEET-PG:**

* **Non-probability methods** include: Quota, Convenience (Accidental), Purposive (Judgmental), and Snowball sampling.
* **Snowball sampling** is the method of choice for "hidden populations" (e.g., IV drug users, commercial sex workers).
* **Multistage sampling** is the most commonly used method in large-scale national health surveys in India (like NFHS).
* **Sampling Error** can be quantified only in probability sampling; non-probability sampling is prone to **Selection Bias**.
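The three probability methods described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame of 100 patient IDs, the ten "villages", and the function names are all hypothetical, and the seed is fixed only so the run is repeatable.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every unit has an equal chance of selection; drawn without replacement."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Random start, then every k-th unit, where k = len(frame) // n."""
    k = len(frame) // n
    start = rng.randrange(k)  # random starting position within the first interval
    return [frame[start + i * k] for i in range(n)]

def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters and include every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(42)  # fixed seed for a repeatable illustration
frame = list(range(1, 101))  # hypothetical frame: patient IDs 1..100
clusters = {f"village_{v}": list(range(v * 10, v * 10 + 10)) for v in range(10)}

srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)
clu = cluster_sample(clusters, 3, rng)
```

In each function the random number generator, not the investigator, decides who is included. Quota sampling lacks exactly this property, because the investigator fills each quota by convenience or judgment.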

Q: Which variation occurs due to different diseases being treated in different hospitals?

Berksonian bias.

**Explanation:** **Berksonian Bias (Admission Rate Bias)** occurs when a study is conducted using hospital-based populations rather than the general community. It arises because patients with multiple diseases (comorbidities) are more likely to be admitted to a hospital than those with only one. If different hospitals specialize in different diseases, the association between an exposure and a disease may be artificially distorted (either strengthened or weakened) because the "hospitalized" sample does not represent the true distribution in the general population.

**Analysis of Incorrect Options:**

* **Neyman Bias (Prevalence-Incidence Bias):** This occurs when there is a gap between the onset of a disease and the selection of study subjects. It typically excludes patients who die early or recover quickly, leading to a sample of "survivors" (prevalent cases) rather than all incident cases.
* **Attention Bias (Hawthorne Effect):** This is a change in behavior by the study participants because they are aware they are being observed or studied.
* **Recall Bias:** Common in case-control studies, this occurs when cases remember past exposures more accurately or differently than healthy controls.

**Clinical Pearls for NEET-PG:**

* **Berksonian Bias** is a type of **Selection Bias**.
* To minimize this bias, researchers should ideally use **community-based samples** rather than hospital-based samples.
* **Key Trigger Words:** "Hospital-based study," "Different admission rates," or "Multiple comorbidities" usually point toward Berksonian bias.
* **Neyman Bias** is most commonly associated with **Cross-sectional studies** involving chronic diseases.

Q: If the mean, median, and mode are 10, 18, and 26 respectively, what kind of distribution is it?

Negatively skewed.

### Explanation

**1. Why the Correct Answer is Right:** In biostatistics, the relationship between the **Mean, Median, and Mode** indicates the shape of a frequency distribution curve.

* In this question: **Mean (10) < Median (18) < Mode (26).**
* When the mean is pulled toward the lower values (the left side), it indicates a **Negatively Skewed Distribution** (also known as "Left-skewed").
* The "tail" of the graph points toward the smaller numbers (negative side of the X-axis). This happens when there are a few extremely low values that drag the mean down, while the majority of data points are clustered at the higher end.

**2. Why the Incorrect Options are Wrong:**

* **Symmetric / Normal Distribution (Options A & B):** In a perfectly symmetrical or Normal (Gaussian) distribution, the **Mean = Median = Mode**. Since the three values here differ (10, 18, 26), the distribution is asymmetrical.
* **Positively Skewed (Option C):** In a positively skewed distribution (Right-skewed), the relationship is reversed: **Mean > Median > Mode**. The tail points toward the higher values (positive side), usually due to a few extremely high outliers.

**3. NEET-PG Clinical Pearls & High-Yield Facts:**

* **The "Alphabetical Rule":** To remember the order in a **Positively Skewed** distribution, list the three words alphabetically and read them from largest to smallest: **Mean > Median > Mode**. For a negatively skewed distribution, reverse the order.
* **Median's Position:** In a typical skewed distribution (positive or negative), the **Median lies between** the Mean and the Mode.
* **Best Measure of Central Tendency:**
  * For **Normal Distribution**: Mean is the best measure.
  * For **Skewed Distribution**: Median is the best measure (as it is not affected by extreme outliers).
* **Formula (Karl Pearson's empirical relationship):** $Mode \approx (3 \times Median) - (2 \times Mean)$. It is an approximation for moderately skewed distributions and is often used to calculate a missing value in NEET-PG numericals.
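The ordering rule and Karl Pearson's empirical formula can be checked with a minimal Python sketch. The function names are hypothetical, and the classifier applies only the textbook ordering rule, not a formal skewness statistic.

```python
def classify_skew(mean, median, mode):
    """Classify a distribution's shape from the ordering of mean, median and mode."""
    if mean == median == mode:
        return "symmetric"
    if mean < median < mode:
        return "negatively skewed"
    if mean > median > mode:
        return "positively skewed"
    return "indeterminate"  # the ordering does not match a textbook pattern

def pearson_empirical_mode(mean, median):
    """Karl Pearson's empirical estimate: mode ~ 3 * median - 2 * mean."""
    return 3 * median - 2 * mean

print(classify_skew(10, 18, 26))       # negatively skewed
print(pearson_empirical_mode(10, 18))  # 34
```

The empirical formula gives 34 for a mean of 10 and a median of 18, whereas the question states a mode of 26. The relationship is therefore an approximation, while the ordering rule still identifies the skew correctly.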

Q: How can the power of a study be increased?

Decreasing beta error.

### Explanation

In biostatistics, the **Power of a Study** is defined as the probability that the study will correctly reject a null hypothesis when it is false (i.e., the ability to detect a true difference or effect).

**1. Why the Correct Answer is Right:** Mathematically, **Power = 1 – β (Beta error)**.

* **Beta (Type II) error** occurs when a researcher fails to reject a null hypothesis that is actually false (a "false negative").
* Since Power and Beta error are complementary, **decreasing the Beta error** directly increases the Power of the study. A study with a Beta error of 0.20 (20%) has a Power of 0.80 (80%).

**2. Why the Incorrect Options are Wrong:**

* **Options A & C (Alpha Error):** Alpha (Type I) error is the probability of rejecting a null hypothesis when it is actually true (a "false positive"). Decreasing alpha makes a study more stringent but, with everything else held fixed, it *increases* the risk of a Type II error and thereby *decreases* power. Increasing alpha does raise power, but only at the cost of more false positives, so it is not the accepted way to improve a study.
* **Option D (Increasing Beta Error):** Increasing the Beta error means the study is more likely to miss a true effect, which mathematically reduces the Power.

**3. NEET-PG High-Yield Clinical Pearls:**

* **Sample Size:** The most common practical way to increase the power of a study in clinical research is to **increase the sample size**.
* **Standard Power:** In most medical research, a power of **80% (0.8)** is considered the minimum acceptable level.
* **Determinants of Power:** Power is influenced by the sample size, the effect size (magnitude of difference), the significance level (alpha), and the variance (standard deviation) in the data.
* **Type I Error (α):** "Finding a difference when none exists."
* **Type II Error (β):** "Missing a difference that actually exists."
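To see how sample size and alpha move power, here is a rough Python sketch. It assumes a two-sided, two-sample z-test with equal group sizes and a known common SD, and it ignores the negligible probability of rejecting in the wrong direction. The function name and the numbers are illustrative, not taken from any particular study.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided two-sample z-test."""
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha
    se = sd * sqrt(2 / n_per_group)  # standard error of the difference in means
    return z.cdf(abs(delta) / se - z_crit)

# Hypothetical scenario: true difference of 5 units, SD of 10 units.
power_small = approx_power(delta=5, sd=10, n_per_group=30)
power_large = approx_power(delta=5, sd=10, n_per_group=64)
power_strict = approx_power(delta=5, sd=10, n_per_group=64, alpha=0.01)

beta_large = 1 - power_large  # Type II error for the larger study
```

Under these assumptions, 64 participants per group gives a power of roughly 0.8. The smaller study has lower power, and so does the study with the stricter alpha of 0.01, which matches the determinants of power listed above.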

Q: What is a measure of dispersion?

Standard deviation.

**Explanation:** In biostatistics, data is summarized using two primary types of measures: **Measures of Central Tendency** (averages) and **Measures of Dispersion** (variability).

**Why Standard Deviation is Correct:** **Standard Deviation (SD)** is the most commonly used measure of dispersion in medical research. It quantifies the extent of variation or "spread" of data points around the arithmetic mean. A low SD indicates that the data points tend to be close to the mean, while a high SD indicates that the data is spread out over a wider range of values. It is the square root of the variance and is expressed in the same units as the original data.

**Why the Other Options are Incorrect:**

* **Mean (A):** This is the arithmetic average of all observations. It is a measure of central tendency, not dispersion.
* **Mode (B):** This is the value that occurs most frequently in a dataset. It is a measure of central tendency used primarily for nominal data.
* **Median (D):** This is the middle-most value when data is arranged in ascending or descending order. It is a measure of central tendency used especially when data is skewed.

**High-Yield Facts for NEET-PG:**

* **Measures of Dispersion include:** Range, Mean Deviation, Standard Deviation, and Coefficient of Variation.
* **Relative Measure of Dispersion:** The **Coefficient of Variation** is used to compare the variability of two different series (e.g., comparing height in cm vs. weight in kg).
* **Normal Distribution (Gaussian Curve):** In a normal distribution, Mean = Median = Mode.
* **The 68-95-99.7 Rule:** In a normal distribution, Mean ± 1 SD covers 68% of values, Mean ± 2 SD covers 95%, and Mean ± 3 SD covers 99.7%.
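A short Python sketch shows why the Coefficient of Variation is needed when two series have different units or different means. The heights and weights below are made-up values, and `stdev` from the standard library computes the sample SD (denominator n − 1), which is an assumption about which SD is wanted.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Relative dispersion: SD expressed as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

heights_cm = [150, 160, 170, 180, 190]  # hypothetical heights
weights_kg = [50, 60, 70, 80, 90]       # hypothetical weights

sd_height = stdev(heights_cm)
sd_weight = stdev(weights_kg)
cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)
value_range = max(heights_cm) - min(heights_cm)  # range, the simplest measure of dispersion
```

Both series have the same SD (about 15.8 in their own units), yet the CV is about 9.3% for height and about 22.6% for weight. Relative to its mean, weight is the more variable series here.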

Q: Standard error of the mean indicates what?

Deviation.

**Explanation:** **Standard Error of the Mean (SEM)** is a measure of the **deviation** of the sample mean from the true population mean. While Standard Deviation (SD) measures the spread of individual observations within a single sample, SEM measures the precision of the sample mean as an estimate of the population mean.

1. **Why "Deviation" is correct:** SEM is mathematically defined as the standard deviation of the sampling distribution of the mean. It quantifies how much the sample mean is likely to "deviate" from the actual population mean. A smaller SEM indicates that the sample mean is a more precise estimate of the population mean.
2. **Why other options are incorrect:**
   * **Dispersion & Variation:** These are broad terms describing the spread of data. While SEM is a type of dispersion, these terms usually refer to **Standard Deviation (SD)** or **Variance**, which describe the spread of individual data points around their own mean, rather than the reliability of the mean itself.
   * **Distribution:** This refers to the overall pattern or shape of the data (e.g., Normal/Gaussian distribution) rather than a specific numerical measure of error or precision.

**High-Yield NEET-PG Pearls:**

* **Formula:** $SEM = \frac{SD}{\sqrt{n}}$ (where $n$ is the sample size).
* **Relationship:** As the sample size ($n$) increases, the SEM decreases, making the estimate more precise.
* **Clinical Application:** SEM is primarily used to calculate **Confidence Intervals (CI)**.
* **Key Distinction:** Use **SD** to describe the variability of a biological characteristic (e.g., BP in a group); use **SEM** to describe the uncertainty of the mean estimate in a study.
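The SEM formula and its use in a confidence interval can be sketched in Python as follows. The blood pressure readings are invented for illustration, the helper names are hypothetical, and the 1.96 multiplier assumes a normal approximation (a small sample would normally use a t-value instead).

```python
from math import sqrt
from statistics import mean, stdev

def sem_from_sd(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / sqrt(n)

def ci95(values):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    m = mean(values)
    margin = 1.96 * sem_from_sd(stdev(values), len(values))
    return (m - margin, m + margin)

# With the SD held at 10, quadrupling n from 25 to 100 halves the SEM.
sem_25 = sem_from_sd(10, 25)    # 2.0
sem_100 = sem_from_sd(10, 100)  # 1.0

systolic_bp = [118, 122, 125, 130, 121, 127, 119, 124]  # hypothetical readings (mmHg)
low, high = ci95(systolic_bp)
```

Because the SEM shrinks with the square root of the sample size, halving it requires four times as many observations.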

Q: Child survival index is calculated by?

(1000 - U5MR)/10.

### Explanation

**1. Why the Correct Answer is Right:** The **Child Survival Index (CSI)** is a health indicator used to represent the probability of a child surviving until their fifth birthday. It is derived from the **Under-5 Mortality Rate (U5MR)**, which is the number of deaths per 1,000 live births before age five. The formula is:

$$\text{Child Survival Index} = \frac{1000 - \text{U5MR}}{10}$$

* **Logic:** Subtracting U5MR from 1,000 gives the number of survivors out of 1,000 live births. Dividing by 10 converts this figure into a **percentage (%)**, making it a standardized index for comparing regional health performance.

**2. Why Incorrect Options are Wrong:**

* **Options A & B:** These use the **Infant Mortality Rate (IMR)**. While IMR (deaths before age 1) is a sensitive indicator of socio-economic development, the Child Survival Index specifically focuses on the "Under-5" milestone, which reflects broader factors like nutrition and immunization.
* **Option D:** This formula results in a negative number (e.g., $(50 - 1000)/10 = -95$), which cannot be a survival index.

**3. High-Yield Facts for NEET-PG:**

* **U5MR vs. IMR:** U5MR is considered the best single indicator of social development and well-being rather than just health status.
* **Child Survival Index:** It was a key metric used in the **Child Survival and Safe Motherhood (CSSM)** program launched in India (1992).
* **Indicator of Choice:** For monitoring the progress of Millennium Development Goals (MDGs) and now Sustainable Development Goals (SDGs), U5MR is the preferred indicator.
* **Current Trend:** As of recent NFHS data, India has seen a significant decline in U5MR, though regional disparities remain.
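As a worked example of the formula, a minimal Python sketch follows. The function name is hypothetical, and the U5MR of 50 per 1,000 live births is an illustrative value, not a reported statistic.

```python
def child_survival_index(u5mr):
    """Child Survival Index (%) from the under-5 mortality rate per 1,000 live births."""
    if not 0 <= u5mr <= 1000:
        raise ValueError("U5MR must lie between 0 and 1000 per 1,000 live births")
    return (1000 - u5mr) / 10

print(child_survival_index(50))  # 95.0
```

The subtraction must be carried out before dividing by 10. Computing 1000 - (50 / 10) would give 995, which is not a percentage.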

Question 1

Which of the following statements about Simple Random Sampling is true?

Accepted Answer

Every element in the population has an equal probability of being included.

Answer

Sampling is based on similar characteristics.

Answer

Suitable for large heterogeneous population.

Answer

A complete list of items within the sampling frame is not required.

Question 2

Which of the following graphical methods is best for studying the decline in the percentage of alcohol usage in men and women over several years?

Accepted Answer

Line diagram

Answer

Pie chart

Answer

Histogram

Answer

Frequency polygon

Question 3

Which of the following is NOT a probability sampling method?

Accepted Answer

Quota sampling

Answer

Simple random sampling

Answer

Systematic random sampling

Answer

Cluster sampling

Question 4

Which of the following statements is true regarding Positive Predictive Value (PPV)?

Accepted Answer

PPV is directly proportional to prevalence.

Answer

PPV is inversely proportional to prevalence.

Answer

PPV has no relation with prevalence.

Answer

PPV doubles with a decrease in prevalence.

Question 5

Which variation occurs due to different diseases being treated in different hospitals?

Accepted Answer

Berksonian bias

Answer

Neyman bias

Answer

Attention bias

Answer

Recall bias

Question 6

If the mean, median, and mode are 10, 18, and 26 respectively, what kind of distribution is it?

Accepted Answer

Negatively skewed

Answer

Symmetric

Answer

Normal

Answer

Positively skewed

Question 7

How can the power of a study be increased?

Accepted Answer

Decreasing beta error

Answer

Increasing alpha error

Answer

Decreasing alpha error

Answer

Increasing beta error

Question 8

What is a measure of dispersion?

Accepted Answer

Standard deviation

Answer

Mean

Answer

Mode

Answer

Median

Question 9

Standard error of the mean indicates what?

Accepted Answer

Deviation

Answer

Dispersion

Answer

Distribution

Answer

Variation

Question 10

Child survival index is calculated by?

Accepted Answer

(1000 - U5MR)/10

Answer

(1000 - IMR)/100

Answer

(IMR - 1000)/10

Answer

(U5MR - 1000)/10
