Biostatistics Practice Questions

Q: All of the following are true regarding increasing sample size except?

Decreases power of the test.

### Explanation

The relationship between sample size and statistical parameters is a high-yield concept in biostatistics. Increasing the sample size ($n$) generally improves the precision and reliability of a study.

**Why Option A is the Correct Answer (The "Except"):**

Increasing the sample size **increases** the power of the test; it does not decrease it.

* **Power ($1 - \beta$)** is the probability of correctly rejecting a null hypothesis when it is false (detecting a true effect).
* As $n$ increases, the study becomes more sensitive to detecting even small differences between groups, thereby increasing the power.

**Analysis of Incorrect Options:**

* **B. Standard error of the mean (SEM) decreases:** The formula for SEM is $\sigma / \sqrt{n}$. Since $\sqrt{n}$ is in the denominator, increasing the sample size mathematically reduces the SEM, leading to more precise estimates.
* **C. Decreases the Confidence Interval (CI):** The width of a CI is determined by the SEM ($CI = Mean \pm Z \times SEM$). As SEM decreases with a larger sample size, the CI becomes narrower (more precise).
* **D. Decreases alpha error:** Alpha ($\alpha$) error (Type I error) is the probability of rejecting a true null hypothesis. While $\alpha$ is usually preset (e.g., 0.05), a larger sample size reduces the overall "noise" and variability, making the results more robust and reducing the likelihood of a chance finding (false positive).

**NEET-PG High-Yield Pearls:**

1. **Sample Size $\propto$ Power:** To detect a smaller effect size, you need a larger sample size.
2. **Sample Size $\propto$ Precision:** Larger samples yield narrower Confidence Intervals (CI width $\propto 1/\sqrt{n}$).
3. **Type II Error ($\beta$):** Increasing sample size is the most effective way to decrease $\beta$ error.
4. **Law of Large Numbers:** As $n$ increases, the sample mean gets closer to the actual population mean.
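
The effect of $n$ on SEM, CI width, and power can be checked numerically. The following is a minimal Python sketch, not part of the original question: the function names, the one-sample two-sided z-test setting, and the values chosen for `sigma` and `delta` are illustrative assumptions.

```python
import math
from statistics import NormalDist


def sem(sigma: float, n: int) -> float:
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def ci_half_width(sigma: float, n: int, z: float = 1.96) -> float:
    """Half-width of a z-based confidence interval: z * SEM."""
    return z * sem(sigma, n)


def power_z_test(delta: float, sigma: float, n: int, z_crit: float = 1.96) -> float:
    """Approximate power of a one-sample, two-sided z-test (assumed setting).

    delta is the true difference from the null mean; z_crit = 1.96
    corresponds to alpha = 0.05, which is held fixed as n changes.
    """
    shift = delta / sem(sigma, n)
    std_normal = NormalDist()
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical values: sigma = 10, true difference = 2.
    for n in (25, 100, 400):
        print(n, sem(10, n), ci_half_width(10, n), round(power_z_test(2, 10, n), 3))
```

In this sketch, quadrupling $n$ halves both the SEM and the CI half-width, and the computed power rises with $n$.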

Q: A correlation coefficient of +1 indicates which of the following?

A perfect positive correlation.

**Explanation:**

The **Correlation Coefficient (r)**, also known as Pearson’s ‘r’, is a statistical measure that quantifies the strength and direction of a linear relationship between two quantitative variables. The value of ‘r’ always ranges from **-1 to +1**.

1. **Why the correct answer is right:** A value of **+1** signifies a **perfect positive correlation**. This means that for every unit increase in one variable, there is a proportional increase in the other. On a scatter diagram, all data points would fall exactly on a straight line sloping upwards from left to right.
2. **Why the incorrect options are wrong:**
    * **Option A & D (Weak/Strong):** These terms describe values between 0 and 1. Generally, 0.1–0.3 is considered weak, 0.4–0.6 is moderate, and 0.7–0.9 is considered a strong correlation.
    * **Option B (Moderate):** A moderate correlation (e.g., r = +0.5) indicates a visible trend, but the data points are scattered around the regression line rather than sitting perfectly on it.

**High-Yield Clinical Pearls for NEET-PG:**

* **Direction:** A positive sign (+) means variables move in the same direction; a negative sign (-) means they move in opposite directions (e.g., as age increases, vital capacity decreases).
* **Strength:** The closer the value is to 1 (regardless of the sign), the stronger the relationship.
* **Zero Correlation (r = 0):** Indicates no linear relationship between the variables.
* **Coefficient of Determination ($r^2$):** This represents the proportion of variance in one variable that is predictable from the other. If r = 0.6, then $r^2$ = 0.36 (or 36%).
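
To make the meaning of r = +1 concrete, here is a small Python sketch that computes Pearson's r from its definition. The function name `pearson_r` and the data points are made up for illustration; they are not from the question.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    xs = [1, 2, 3, 4]
    rising = [3, 5, 7, 9]    # points lie exactly on an upward-sloping line
    falling = [9, 7, 5, 3]   # points lie exactly on a downward-sloping line
    print(pearson_r(xs, rising), pearson_r(xs, falling))
    # Coefficient of determination for r = 0.6
    print(round(0.6 ** 2, 2))
```

Points lying exactly on an upward-sloping line give r = +1 and the mirrored data give r = -1, matching the scatter-diagram description above.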

Q: Mean hemoglobin values are compared between two independent population groups. Which is the most appropriate statistical test to use?

Unpaired t-test.

### Explanation

**Why Option B (Unpaired t-test) is Correct:**

The choice of a statistical test depends on the **type of data** and the **number of groups** being compared.

1. **Data Type:** Hemoglobin is a continuous numerical variable (Quantitative data).
2. **Groups:** The question specifies "two independent population groups" (e.g., comparing hemoglobin levels in males vs. females).

The **Unpaired t-test** (also known as the Independent Student’s t-test) is specifically designed to compare the means of two independent groups to determine if there is a statistically significant difference between them.

**Why Other Options are Incorrect:**

* **A. Paired t-test:** Used for quantitative data when the two sets of observations are dependent or related (e.g., comparing hemoglobin levels in the *same* group of patients before and after iron supplementation).
* **C. Chi-square test:** Used for **qualitative (categorical)** data to compare proportions between two or more groups (e.g., comparing the percentage of "anemic" vs. "non-anemic" individuals).
* **D. Fisher's exact test:** An alternative to the Chi-square test used for qualitative data when the sample size is very small (specifically when any expected cell frequency in a 2x2 table is less than 5).

**High-Yield Clinical Pearls for NEET-PG:**

* **Z-test vs. T-test:** Use a **Z-test** if the sample size is large (**n > 30**) and a **T-test** if the sample size is small (**n < 30**).
* **ANOVA (F-test):** Use this when comparing the means of **three or more** independent groups (e.g., comparing hemoglobin in three different socioeconomic classes).
* **Correlation (r):** Used to study the *strength of relationship* between two quantitative variables, not to compare means.
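
As a rough illustration of what the unpaired t-test computes, the sketch below derives the pooled-variance t statistic using only Python's standard library. The function name `unpaired_t` and the hemoglobin values are hypothetical, and the equal-variance (Student) form is an assumption. Converting t to a P-value needs a t-distribution table or a statistics package, which is left out here.

```python
import math
from statistics import mean, variance


def unpaired_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for Student's unpaired t-test.

    Assumes the two groups are independent and have equal variances
    (pooled-variance form).
    """
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group_a) - mean(group_b)) / std_err, df


if __name__ == "__main__":
    # Made-up hemoglobin values (g/dL) for two independent groups.
    males = [14.1, 15.0, 13.8, 14.6, 15.2]
    females = [12.4, 13.1, 12.9, 13.5, 12.2]
    t_stat, df = unpaired_t(males, females)
    print(round(t_stat, 2), df)
```

A paired design would instead work on the within-subject differences, which is why the two tests are not interchangeable.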

Q: In the calculation of crude death rate, which point in the year is the population typically considered?

July 1st.

### Explanation

**Correct Answer: B. July 1st**

In biostatistics and demography, the **Crude Death Rate (CDR)** is defined as the number of deaths per 1,000 population in a given year. The denominator used for this calculation is the **Mid-Year Population**.

**Why July 1st?**

The population of any region is dynamic, changing daily due to births, deaths, and migration. To represent the average population exposed to the risk of death throughout the entire year, we use the population as it stands on **July 1st** (the midpoint of the calendar year). This "Mid-Year Population" acts as an estimate of the average person-years lived by the population during that year.

**Analysis of Incorrect Options:**

* **A. March 1st:** In India, the National Census (conducted every 10 years) traditionally uses March 1st as the reference date for enumeration. However, for annual vital statistics like CDR, the mid-year estimate is preferred.
* **C. April 1st:** This marks the beginning of the financial year in India but holds no specific statistical significance for calculating demographic rates.
* **D. August 15th:** While significant as India’s Independence Day, it is not a standard reference point for demographic data.

**High-Yield Clinical Pearls for NEET-PG:**

* **Mid-Year Population** is the standard denominator for most annual vital rates, including Crude Birth Rate (CBR) and General Fertility Rate (GFR).
* **Crude Death Rate** is "crude" because it does not account for the age and sex composition of the population.
* **Age-Specific Death Rate** is considered a better indicator of the health status of a specific cohort.
* **Standardized Death Rate** is the best tool for comparing mortality between two different populations (e.g., two different states or countries) as it eliminates the bias of age distribution.
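
The CDR definition above translates directly into a one-line calculation. This is a minimal sketch; the function name `crude_death_rate` and the figures are hypothetical.

```python
def crude_death_rate(deaths_in_year: int, mid_year_population: int) -> float:
    """Crude death rate: deaths per 1,000 mid-year (July 1st) population."""
    if mid_year_population <= 0:
        raise ValueError("mid-year population must be positive")
    return deaths_in_year * 1000 / mid_year_population


if __name__ == "__main__":
    # Hypothetical district: 700 deaths in the year, 100,000 people on July 1st.
    print(crude_death_rate(700, 100_000))
```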

Q: If rapidly progressive cancers are missed by a screening test, which type of bias will occur?

Length bias.

### Explanation

**Length Bias (Length-time Bias)** occurs because screening tests are inherently more likely to detect slow-growing, indolent tumors, which have a longer "pre-clinical asymptomatic period." Conversely, **rapidly progressive cancers** have a very short window between being detectable by a test and becoming symptomatic. Consequently, these aggressive cases are often missed by periodic screening and present as "interval cancers" between screening rounds. This creates a false impression that the screening program is more effective than it actually is, as it disproportionately identifies patients with a better prognosis.

**Analysis of Incorrect Options:**

* **Lead-time Bias:** This is an illusion of increased survival time. It occurs when a disease is detected earlier (at the screening stage) than it would have been without screening, but the actual time of death remains unchanged. The patient simply lives longer with the *knowledge* of the disease.
* **Selection Bias:** This occurs when the group of people who volunteer for screening (the "worried well") are healthier or more health-conscious than the general population, skewing the results.
* **Surveillance Bias (Detection Bias):** This occurs when one group is monitored more closely than another, leading to an increased probability that a condition will be diagnosed in that group.

**High-Yield Clinical Pearls for NEET-PG:**

* **Length Bias** relates to the **nature/velocity** of the disease (slow vs. fast).
* **Lead-time Bias** relates to the **timing** of the diagnosis.
* To eliminate the effect of these biases in studies, **mortality rates** (rather than 5-year survival rates) should be compared in a Randomized Controlled Trial (RCT).
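
Why fast-growing cancers are under-represented among screen-detected cases can be shown with a small calculation. The sketch below is an illustrative model, not taken from the source: it assumes a perfectly sensitive test, screens at a fixed interval, and a tumor whose detectable pre-clinical phase starts at a uniformly random time within a screening cycle. Under those assumptions the chance that a screen falls inside the pre-clinical window is the window length divided by the interval, capped at 1.

```python
def detection_probability(preclinical_months: float, interval_months: float) -> float:
    """Chance that a periodic screen falls inside the pre-clinical window.

    Assumed model: perfect test sensitivity, fixed screening interval,
    and a uniformly random start of the pre-clinical phase.
    """
    if interval_months <= 0:
        raise ValueError("screening interval must be positive")
    return min(1.0, preclinical_months / interval_months)


if __name__ == "__main__":
    # Hypothetical annual screening (every 12 months).
    for window in (3, 6, 24):
        print(window, detection_probability(window, 12))
```

With annual screening in this model, a tumor with a 3-month pre-clinical window is caught a quarter of the time, while one with a 24-month window is always caught, so the screen-detected group is enriched for slow-growing disease.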

Q: Proportional mortality rate is:

Number of deaths due to a particular cause.

**Explanation:**

**Proportional Mortality Rate (PMR)** is an indicator used in epidemiology to express the relative importance of a specific cause of death in relation to the total number of deaths in a population.

1. **Why Option A is Correct:** The numerator of the Proportional Mortality Rate is the **number of deaths due to a particular cause** (or in a specific age group) in a given year. The denominator is the **total number of deaths** from all causes in that same year. It is expressed as a percentage:

    $$\text{PMR} = \frac{\text{Deaths due to a particular cause}}{\text{Total deaths from all causes}} \times 100$$

    It does not measure the risk of dying (like the Case Fatality Rate) but rather the "burden" of a specific disease relative to all-cause mortality.
2. **Why Other Options are Incorrect:**
    * **Option B:** "Number of deaths during that year" refers to the total mortality, which serves as the denominator for PMR, not the rate itself.
    * **Option C:** Mortality rates are typically calculated annually to account for seasonal variations; a one-month snapshot is not a standard epidemiological measure for PMR.

**High-Yield Clinical Pearls for NEET-PG:**

* **PMR vs. Case Fatality Rate (CFR):** CFR measures the killing power of a disease (Numerator: Deaths from disease; Denominator: Total cases of that disease). PMR measures the proportion of total deaths.
* **PMR vs. Specific Death Rate:** In Specific Death Rate, the denominator is the **mid-year population**, whereas in PMR, the denominator is **total deaths**.
* **Usefulness:** PMR is highly useful when population data (denominator) is unavailable. It helps in identifying the leading causes of death in a community.
* **Common Example:** "Proportional mortality rate for communicable diseases" helps determine if a country is in the stage of epidemiological transition.
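
The difference between PMR and CFR comes down to the denominator, which a short sketch makes explicit. The function names and the counts below are hypothetical.

```python
def proportional_mortality_rate(cause_deaths: int, total_deaths: int) -> float:
    """PMR (%): deaths from one cause as a share of deaths from all causes."""
    return cause_deaths * 100 / total_deaths


def case_fatality_rate(cause_deaths: int, total_cases: int) -> float:
    """CFR (%): deaths from a disease as a share of all cases of that disease."""
    return cause_deaths * 100 / total_cases


if __name__ == "__main__":
    # Hypothetical year: 50 deaths from disease X, 200 deaths overall,
    # and 400 diagnosed cases of disease X.
    print(proportional_mortality_rate(50, 200))
    print(case_fatality_rate(50, 400))
```

The same 50 deaths yield different figures because PMR divides by total deaths while CFR divides by total cases; neither uses the mid-year population.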

Q: What is an ogive?

Cumulative frequency curve.

### Explanation

**Correct Answer: C. Cumulative frequency curve**

An **Ogive** (also known as a cumulative frequency polygon) is a graphical representation of the cumulative frequency of a dataset. It is constructed by plotting the cumulative frequencies (either "less than" or "more than" type) against the upper or lower class boundaries.

* **Why it is correct:** In biostatistics, while a frequency polygon shows the distribution of data points, the Ogive specifically tracks the **running total**. It is the primary tool used to determine the **Median**, quartiles, and percentiles of a distribution graphically. The point where the "less than" and "more than" ogives intersect corresponds to the Median on the x-axis.

**Analysis of Incorrect Options:**

* **A. Bar Chart:** Used for **qualitative (categorical)** or discrete data. Bars are separated by spaces.
* **B. Histogram:** Used for **continuous quantitative** data. It consists of adjacent rectangles where the area represents the frequency. It is used to find the **Mode** graphically.
* **D. Frequency Polygon:** A line graph formed by joining the midpoints of the tops of the bars in a histogram. It represents the frequency distribution of continuous data but does not show cumulative totals.

**High-Yield NEET-PG Pearls:**

1. **Median** is determined by the **Ogive**.
2. **Mode** is determined by the **Histogram**.
3. **Mean** cannot be determined graphically; it must be calculated.
4. **Normal Distribution:** In a perfectly symmetrical bell-shaped curve, the Mean, Median, and Mode coincide at the same point.
5. **Scatter Diagram:** Used to show the **correlation** (relationship) between two continuous variables.
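
Reading the median off a "less than" ogive amounts to finding where the curve crosses half the total frequency. The sketch below builds the ogive points and interpolates linearly between them. The function names and the class frequencies are invented for illustration, and straight-line segments between plotted points are an assumption.

```python
from itertools import accumulate


def less_than_ogive(upper_bounds, freqs):
    """Points (upper class boundary, cumulative frequency) of a 'less than' ogive."""
    return list(zip(upper_bounds, accumulate(freqs)))


def median_from_ogive(first_lower_bound, upper_bounds, freqs):
    """x-value where the 'less than' ogive reaches half the total frequency."""
    points = less_than_ogive(upper_bounds, freqs)
    half = points[-1][1] / 2
    prev_bound, prev_cum = first_lower_bound, 0
    for bound, cum in points:
        if cum >= half and cum > prev_cum:
            # Linear interpolation along the ogive segment for this class.
            fraction = (half - prev_cum) / (cum - prev_cum)
            return prev_bound + fraction * (bound - prev_bound)
        prev_bound, prev_cum = bound, cum
    raise ValueError("frequencies must sum to a positive total")


if __name__ == "__main__":
    # Hypothetical classes 0-10, 10-20, 20-30, 30-40 with these frequencies.
    uppers = [10, 20, 30, 40]
    freqs = [5, 15, 20, 10]
    print(less_than_ogive(uppers, freqs))
    print(median_from_ogive(0, uppers, freqs))
```

Here the cumulative totals are 5, 20, 40, 50, so the half-way count of 25 falls in the 20-30 class and the interpolated median is 22.5.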

Q: What is the denominator used for calculating the infant mortality rate?

Per 1000 live births.

**Explanation:**

The **Infant Mortality Rate (IMR)** is a critical indicator of the overall health status of a community and the effectiveness of its maternal and child health services. It is defined as the number of deaths of children under one year of age per 1,000 live births in a given year.

**1. Why Option C is Correct:**

By international convention and standard epidemiological practice, the IMR is expressed as a rate **per 1,000 live births**. This standardization allows for meaningful comparisons between different regions and time periods. The formula is:

$$\text{IMR} = \frac{\text{Number of deaths under 1 year of age in a year}}{\text{Total number of live births in the same year}} \times 1000$$

**2. Why Other Options are Incorrect:**

* **Option A & B:** "Per live birth" or "Per 100 live births" (percentage) are not used for IMR because infant deaths are relatively rare compared to the number of live births; using a larger multiplier (1,000) provides a whole number that is easier to interpret and track.
* **Option D:** "Per lakh (100,000) live births" is the standard denominator for the **Maternal Mortality Ratio (MMR)**, not the IMR.

**3. High-Yield Clinical Pearls for NEET-PG:**

* **IMR vs. MMR:** Always remember that IMR is per **1,000**, while MMR is per **1,00,000**.
* **Components:** IMR includes both Neonatal Mortality (0-28 days) and Post-neonatal Mortality (28 days to 1 year).
* **Best Indicator:** IMR is considered the most sensitive indicator of the availability and utilization of health care.
* **Current Trend:** As per the latest SRS (Sample Registration System) data, India’s IMR has shown a steady decline, with rural rates typically higher than urban rates.
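
The IMR formula and the differing multipliers for IMR and MMR can be sketched in a few lines. The function names and counts are hypothetical; only the multipliers (1,000 and 100,000) come from the explanation above.

```python
def infant_mortality_rate(infant_deaths: int, live_births: int) -> float:
    """IMR: deaths under 1 year of age per 1,000 live births in the same year."""
    return infant_deaths * 1000 / live_births


def maternal_mortality_ratio(maternal_deaths: int, live_births: int) -> float:
    """MMR: maternal deaths per 100,000 (one lakh) live births."""
    return maternal_deaths * 100_000 / live_births


if __name__ == "__main__":
    # Hypothetical region: 50,000 live births, 1,400 infant deaths, 60 maternal deaths.
    print(infant_mortality_rate(1_400, 50_000))
    print(maternal_mortality_ratio(60, 50_000))
```

Both measures share live births as the denominator; only the multiplier differs.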

Question 1

All of the following are true regarding increasing sample size except?

Accepted Answer

Decreases power of the test

Answer

Standard error of the mean decreases

Answer

Decreases the Confidence Interval

Answer

Decreases alpha error

Question 2

What is the definition of "incidence"?

Accepted Answer

The number of new cases occurring in a defined population during a specific period.

Answer

The total number of cases (new and old) occurring in a defined population during a specific period.

Answer

The number of individuals exposed to a risk factor who develop the disease in a defined population during a specific period.

Answer

None of the above

Question 3

A correlation coefficient of +1 indicates which of the following?

Accepted Answer

A perfect positive correlation

Answer

A very weak positive correlation

Answer

A moderate positive correlation

Answer

A strong positive correlation

Question 4

Mean hemoglobin values are compared between two independent population groups. Which is the most appropriate statistical test to use?

Accepted Answer

Unpaired t-test

Answer

Paired t-test

Answer

Chi-square test

Answer

Fisher's exact test

Question 5

In a well-designed clinical trial comparing a new drug and usual care for ovarian cancer, the remission rate at one year was similar for both treatments. The P-value obtained was 0.4. What does this P-value indicate?

Accepted Answer

Neither treatment is effective.

Answer

Both treatments are effective.

Answer

The statistical power of the study is 60%.

Answer

The best estimate of the treatment effect is 0.4.

Question 6

In the calculation of crude death rate, which point in the year is the population typically considered?

Accepted Answer

July 1st

Answer

March 1st

Answer

April 1st

Answer

August 15th

Question 7

If rapidly progressive cancers are missed by a screening test, which type of bias will occur?

Accepted Answer

Length bias

Answer

Lead-time bias

Answer

Selection bias

Answer

Surveillance bias

Question 8

Proportional mortality rate is:

Accepted Answer

Number of deaths due to a particular cause

Answer

Number of deaths during that year

Answer

Number of deaths in one month

Answer

None of the above

Question 9

What is an ogive?

Accepted Answer

Cumulative frequency curve

Answer

Bar chart

Answer

Histogram

Answer

Frequency polygon

Question 10

What is the denominator used for calculating the infant mortality rate?

Accepted Answer

Per 1000 live births

Answer

Per live birth

Answer

Per 100 live births

Answer

Per lakh live births
