Biostatistics Practice Questions

Q: Which of the following statements is true for a left-skewed distribution?

Answer: Median > Mean

***Median > Mean***
- In a **left-skewed distribution**, the bulk of the data lies on the right and the tail extends to the left, pulling the **mean** toward the lower values.
- The **median** is less affected by extreme values in the tail, so the **mean** ends up below the **median**.

*Mean = Median*
- This relationship holds for a **symmetrical distribution**, such as a **normal distribution**, where the data are evenly distributed around the center.
- In a **skewed distribution**, the mean and median diverge because of extreme values on one side.

*Mean>Mode*
- This is characteristic of a **right-skewed distribution**, where the tail extends to the right and pulls the **mean** above the **mode**.
- In a right-skewed distribution, typically **mode < median < mean**.

*Mean < Mode*
- In a unimodal left-skewed distribution the typical ordering is **mean < median < mode**, so this relationship usually holds as well. It is not, however, the comparison conventionally used to define left skewness.
- The standard relationship for left skewness is **mean < median**, which is why **Median > Mean** is the accepted answer.
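The ordering described above can be checked on a small sample. This is a minimal sketch using Python's standard `statistics` module; the `scores` values are hypothetical and chosen only so that a few low values form a left tail.

```python
from statistics import mean, median, mode

# Hypothetical left-skewed sample: most values are high,
# and a few low values form the tail on the left.
scores = [2, 5, 8, 9, 9, 10, 10, 10]

sample_mean = mean(scores)      # pulled down by the low values 2 and 5
sample_median = median(scores)  # middle of the sorted values
sample_mode = mode(scores)      # most frequent value

print(sample_mean, sample_median, sample_mode)  # prints 7.875 9.0 10
```

For this sample the mean falls below the median, and the median falls below the mode, matching the typical left-skewed ordering.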

Q: Which of the following statements about the population pyramid of India is incorrect?

Answer: India has narrow base

***India has narrow base***
- A **narrow base** in a population pyramid indicates a **low birth rate** and a small proportion of young people.
- This statement is **false for India**: India's population pyramid has a **broad base** because of high birth rates and a large proportion of children and young people.
- Because the question asks for the incorrect statement, this is the answer.

*India has narrow apex*
- A **narrow apex** signifies a **smaller proportion of older individuals**, indicating lower life expectancy.
- This statement is true of India's population pyramid, so it is not the answer.

*Developing countries have bulge in the center*
- A **bulge in the center** represents a larger cohort of working-age adults in developing countries undergoing demographic transition.
- It reflects improvements in childhood survival and declining (but still substantial) birth rates.
- This statement is true, so it is not the answer.

*India has broad base*
- A **broad base** indicates a **high birth rate** and a large proportion of young children in the population.
- This statement is true and characteristic of India's population structure, so it is not the answer.

Q: Which graphical representation is best suited for depicting continuous quantitative data?

Answer: Histogram

***Histogram***
- A **histogram** is specifically designed to depict the distribution of **continuous quantitative data** by dividing the data into bins and showing the frequency of data points within each bin.
- The bars in a histogram are adjacent, indicating the continuous nature of the data and representing ranges of values.

*Bar diagram*
- A **bar diagram** (or bar chart) is typically used for comparing **discrete categories** or displaying changes over time for categorical data.
- The bars in a bar diagram are usually separated, emphasizing distinct categories rather than continuous ranges.

*Pie chart*
- A **pie chart** is used to show the **proportions of a whole**, representing parts of a composition for categorical data.
- It is not suitable for continuous data, as it provides no information about the distribution or frequency across a range of values.

*Pictogram*
- A **pictogram** uses images or icons to represent data, making it visually engaging, but it is generally used for **simple comparisons of discrete or categorical data**.
- It lacks the precision and detail required to accurately depict the distribution or frequency of continuous quantitative data.
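The binning step behind a histogram can be sketched in a few lines. This is only an illustration using the standard library: the `heights_cm` values, the starting edge of 150, and the bin width of 5 are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical continuous measurements (heights in cm).
heights_cm = [151.2, 153.8, 158.1, 159.9, 160.0,
              162.4, 167.5, 168.3, 171.0, 174.6]

start = 150      # assumed left edge of the first bin
bin_width = 5    # assumed width of each bin

# Assign each value to a bin index, then count values per bin.
counts = Counter(int((h - start) // bin_width) for h in heights_cm)

# Print a text histogram: adjacent ranges, one row per non-empty bin.
for i in sorted(counts):
    low = start + i * bin_width
    print(f"[{low}, {low + bin_width}): {'#' * counts[i]}")
```

Each row covers a range of values rather than a single category, which is why the bars of a histogram touch.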

Q: What does a highly sensitive test imply about its false negative rate?

Answer: Low false negative rate

***Low false negative rate***
- A highly **sensitive test** is good at identifying true positives, meaning it correctly identifies most people who have the disease.
- Sensitivity = TP/(TP+FN), and false negative rate = FN/(TP+FN) = 1 − sensitivity, so high sensitivity mathematically means few false negatives.
- This translates directly to a **low false negative rate**: few people with the disease are missed.

*High false positive rate*
- A high **false positive rate** relates to **specificity**, not sensitivity.
- False positive rate = FP/(FP+TN), which measures how many healthy people are incorrectly identified as diseased.
- While some sensitive tests have lower specificity (a higher false positive rate), this is not a direct implication of high sensitivity.

*High true negative rate*
- A high **true negative rate** is a characteristic of a highly **specific** test, which correctly identifies people who do **not** have the disease.
- True negative rate = TN/(TN+FP) = specificity.
- **Sensitivity** is calculated among people with the disease and **specificity** among people without it, so high sensitivity does not imply a high true negative rate.

*High true positive rate*
- A high **true positive rate** is another term for high sensitivity (Sensitivity = TPR = TP/(TP+FN)).
- This is true of a sensitive test, but the question asks specifically about the **false negative rate**.
- The **most direct answer** regarding false negatives is therefore "low false negative rate" rather than a description of the true positive rate.
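The formulas quoted above can be tied together with a short calculation. This is a sketch only: the function name `diagnostic_rates` and the 2×2 counts are hypothetical, chosen to show that sensitivity and the false negative rate sum to 1.

```python
def diagnostic_rates(tp, fn, fp, tn):
    """Compute the four rates from 2x2 table counts (hypothetical helper)."""
    return {
        "sensitivity": tp / (tp + fn),          # true positive rate
        "false_negative_rate": fn / (tp + fn),  # equals 1 - sensitivity
        "specificity": tn / (tn + fp),          # true negative rate
        "false_positive_rate": fp / (fp + tn),  # equals 1 - specificity
    }

# Hypothetical counts: 100 diseased people, 100 healthy people.
rates = diagnostic_rates(tp=95, fn=5, fp=20, tn=80)
print(rates)
```

With these counts the test misses 5 of 100 diseased people, so sensitivity is 0.95 and the false negative rate is 0.05. The false positive rate (0.20) is set entirely by the healthy group's counts.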

Q: In epidemiological studies, which type of diagram is most effective for representing disease incidence trends over time?

Answer: Line graph

***Line graph***
- A **line graph** is ideal for visualizing **trends over time** because it connects data points sequentially, making it easy to observe increases, decreases, or stability in disease incidence.
- The x-axis typically represents **time intervals** (e.g., years, months), and the y-axis represents the incidence rate, clearly showing how these values change.

*Bar graph*
- A **bar graph** is generally used for comparing **discrete categories** or displaying quantities for different groups, not for continuous trends over time.
- While it can show incidence for different time periods, it doesn't convey the **continuity** or the overall progression as effectively as a line graph.

*Scatter plot*
- A **scatter plot** is primarily used to display the **relationship between two numerical variables** or to identify correlations.
- It does not inherently show a **trend over time** as clearly as a line graph; instead, it shows individual data points and their distribution.

*Pie chart*
- A **pie chart** is used to show **proportions or percentages** of a whole, making it suitable for displaying the distribution of categories at a single point in time.
- It is **not appropriate** for showing changes or trends over time, as it cannot effectively represent sequential data or temporal patterns.

Q: An investigator wants to know the similarity of the mean peak flow of expiratory rates among non-smokers, light smokers, moderate smokers, and heavy smokers. Which statistical test of significance is appropriate?

Answer: One way ANOVA

***One way ANOVA***
- This test is appropriate for comparing the means of **three or more independent groups** (non-smokers, light, moderate, and heavy smokers) on a **single quantitative dependent variable** (peak flow of expiratory rates).
- It determines whether there is a statistically significant difference between the group means, indicating that at least one group mean differs from the others.

*Two way ANOVA*
- This test is used when there are **two independent categorical variables** (factors) influencing a single continuous dependent variable.
- In this scenario, there is only one independent categorical variable (smoking status) with multiple levels.

*Student-t test*
- The Student-t test is used to compare the means of **only two groups**.
- Since this question involves comparing the means of four groups, a t-test would not be appropriate.

*Chi square test*
- The Chi-square test is used for analyzing the association between **two categorical variables**.
- Here, one variable (peak flow) is continuous, making the Chi-square test unsuitable.
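To make the one-way ANOVA comparison concrete, the F-statistic can be computed by hand as the ratio of between-group to within-group mean squares. This is a minimal standard-library sketch: the function name `one_way_anova_f` and the peak flow values are hypothetical, and a p-value would additionally require the F distribution from a statistics library.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of independent groups."""
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical peak expiratory flow values (L/min), one list per group.
non_smokers = [520, 540, 560, 550]
light = [500, 510, 530, 520]
moderate = [470, 480, 500, 490]
heavy = [430, 450, 460, 440]

f_stat, df_b, df_w = one_way_anova_f([non_smokers, light, moderate, heavy])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the spread within groups would suggest. With four groups of four, the degrees of freedom are 3 and 12.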

Q: What is the most appropriate statistical test to test the statistical significance of the change in blood cholesterol levels after a month's treatment with atorvastatin?

Answer: Paired t-test

***Paired t-test***
- A **paired t-test** is appropriate when comparing two means from the **same group of subjects** measured at two different time points (before and after treatment).
- In this scenario, a single group's blood cholesterol levels are measured *before* and *after* atorvastatin treatment, making the observations dependent.

*Unpaired or independent t-test*
- An **unpaired t-test** is used to compare the means of two *independent* groups.
- It would be used, for instance, if cholesterol levels were being compared between a group receiving atorvastatin and a separate control group.

*Analysis of variance*
- **Analysis of variance (ANOVA)** is used to compare **three or more means**.
- It would be appropriate if there were multiple treatment groups or more than two time points to compare.

*Chi-square test*
- The **Chi-square test** is used to examine the association between **categorical variables**.
- It would not be suitable here, as blood cholesterol level is a continuous numerical variable, not a categorical one.
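The paired design reduces to a one-sample test on the within-subject differences. The sketch below computes the paired t-statistic with the standard library; the function name `paired_t_statistic` and the cholesterol values are hypothetical, and converting t to a p-value would require the t distribution from a statistics library.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for paired measurements."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # One difference per subject: this is what makes the test 'paired'.
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical total cholesterol (mg/dL) for the same six patients.
before = [240, 255, 230, 262, 248, 251]
after = [205, 221, 208, 229, 214, 220]

t_stat, df = paired_t_statistic(before, after)
print(f"t({df}) = {t_stat:.2f}")
```

Because each patient serves as their own control, between-patient variability drops out of the differences.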

Q: For testing the statistical significance of the difference in heights among different groups of school children, which statistical test would be most appropriate?

Answer: ANOVA

***ANOVA (Analysis of Variance)***
- **ANOVA** is used to compare the means of **three or more independent groups** simultaneously. Here, heights are compared across "different groups" of school children, implying more than two groups.
- It tests whether there are any significant differences between the group means, using the **F-statistic**.

*Student's t test*
- The **Student's t-test** is designed to compare the means of **only two groups**. It would be inappropriate for comparing more than two groups.
- Applying multiple t-tests across several groups would increase the risk of **Type I error** (false positive).

*chi-square test*
- The **chi-square test** is used for analyzing **categorical data** (frequencies or proportions), not for comparing means of continuous data like height.
- It determines whether there is a significant association between two categorical variables.

*Paired 't' test*
- A **paired t-test** is used when comparing the means of two related groups or when measurements are taken from the **same subjects at two different times** (e.g., before and after an intervention).
- This scenario involves independent groups of children, not paired or repeated measures.
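The Type I error inflation from repeated t-tests can be illustrated numerically. This sketch assumes the pairwise tests are independent, which is a simplification (pairwise comparisons that share groups are not independent), and the choice of four groups is hypothetical since the question does not state a number.

```python
import math

def familywise_error_rate(n_groups, alpha=0.05):
    """Approximate chance of at least one false positive across all
    pairwise t-tests, assuming (as a simplification) independent tests."""
    n_comparisons = math.comb(n_groups, 2)  # number of distinct pairs
    fwer = 1 - (1 - alpha) ** n_comparisons
    return n_comparisons, fwer

# Hypothetical example: four groups of school children.
n_comparisons, fwer = familywise_error_rate(4)
print(n_comparisons, round(fwer, 3))  # prints 6 0.265
```

Under this approximation, six pairwise tests at α = 0.05 carry roughly a 26% chance of at least one false positive, compared with 5% for a single ANOVA F-test.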

Q: What is the 95% confidence interval for the intraocular pressure (IOP) in the 400 people, given a mean of 25 mm Hg and a standard deviation of 10 mm Hg?

Answer: 24-26

***24-26***
- This is the 95% confidence interval for the mean, calculated using the formula: **mean ± (Z-score × standard error of the mean)**.
- For a 95% confidence interval, the **Z-score is 1.96**.
- The **standard error of the mean (SEM)** = standard deviation / √(sample size) = 10 / √400 = 10 / 20 = **0.5**.
- Therefore: 25 ± (1.96 × 0.5) = 25 ± 0.98 = **24.02 to 25.98**, which rounds to **24-26**.

*22-28*
- This interval is too wide for a 95% confidence interval with the given parameters.
- A margin of ±3 corresponds to a Z-score of 3/0.5 = 6, far beyond the **1.96 required for 95% confidence**.
- This would represent a much higher confidence level (>99.9%).

*23-27*
- This interval is about twice as wide as the calculated one.
- A margin of ±2 corresponds to a Z-score of 2/0.5 = 4, which **overstates the margin needed for 95% confidence**.
- This would correspond to approximately 99.99% confidence.

*21-29*
- This interval is far too wide for a 95% confidence interval.
- A margin of ±4 corresponds to a Z-score of 4/0.5 = 8, which implies an **extremely high confidence level** (virtually 100%).
- This dramatically exceeds what is needed for 95% confidence.
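The arithmetic above can be reproduced in a few lines. This is a sketch using only the standard library; the function names are hypothetical, and `two_sided_coverage` relies on the normal-distribution identity that the central coverage for a given Z equals erf(Z/√2).

```python
import math

def mean_confidence_interval(sample_mean, sd, n, z=1.96):
    """Normal-approximation CI for a mean: mean ± z × SEM."""
    sem = sd / math.sqrt(n)  # standard error of the mean
    margin = z * sem
    return sample_mean - margin, sample_mean + margin

def two_sided_coverage(z):
    """Confidence level implied by a given Z-score under normality."""
    return math.erf(z / math.sqrt(2))

low, high = mean_confidence_interval(25, 10, 400)
print(f"95% CI: {low:.2f} to {high:.2f}")  # prints 95% CI: 24.02 to 25.98

# Confidence implied by the 23-27 option: margin 2, SEM 0.5, so Z = 4.
print(two_sided_coverage(2 / 0.5))
```

The second call shows why the wider options are wrong: a margin of ±2 already implies a confidence level above 99.99%.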

Q: What is the 95% confidence interval in a study with an estimated prevalence of 10% and a sample size of 100, expressed as a percentage range?

Answer: 4% to 16%

***4% to 16%***
- To calculate the 95% **confidence interval** for a **proportion**, we use the formula: p ± 1.96 × sqrt(p × (1 − p) / n).
- Given a prevalence (**p**) of 0.10 and a **sample size** (**n**) of 100, the standard error is sqrt((0.10 × 0.90) / 100) = sqrt(0.0009) = 0.03.
- The 95% confidence interval is 0.10 ± (1.96 × 0.03), which is 0.10 ± 0.0588. This gives a range of 0.0412 to 0.1588, or approximately **4% to 16%**.

*Inadequate information to calculate 95% CI*
- The necessary information, namely the **prevalence** (10%) and the **sample size** (100), is provided in the question.
- With these two values, the 95% confidence interval can be calculated using the standard formula.

*6% to 16%*
- This range is not centered on the estimated **prevalence** of 10%, whereas the standard formula gives an interval symmetric about p.
- Its lower bound of 6% is too high; the calculation yields a lower bound of about 4%.

*5% to 15%*
- This range is centered on 10% but is slightly narrower than the **calculated interval**.
- A margin of ±5% corresponds to a Z-score of 0.05/0.03 ≈ 1.67, or roughly 90% confidence. The standard formula for a **proportion** with the given values gives bounds closer to 4% and 16%.
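The same calculation can be written as a short function. This is a sketch of the normal-approximation (Wald) interval quoted above, using only the standard library; the function name `proportion_confidence_interval` is hypothetical.

```python
import math

def proportion_confidence_interval(p, n, z=1.96):
    """Normal-approximation CI for a proportion: p ± z × sqrt(p(1-p)/n)."""
    standard_error = math.sqrt(p * (1 - p) / n)
    margin = z * standard_error
    return p - margin, p + margin

low, high = proportion_confidence_interval(0.10, 100)
print(f"{low:.1%} to {high:.1%}")  # prints 4.1% to 15.9%
```

Rounded to whole percentages, the result matches the accepted answer of 4% to 16%.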

Question 1

Which of the following statements is true for a left-skewed distribution?

Accepted Answer

Median > Mean

Answer

Mean = Median

Answer

Mean>Mode

Answer

Mean < Mode

Question 2

Which of the following statements about the population pyramid of India is incorrect?

Accepted Answer

India has narrow base

Answer

India has narrow apex

Answer

Developing countries have bulge in the center

Answer

India has broad base

Question 3

Which graphical representation is best suited for depicting continuous quantitative data?

Accepted Answer

Histogram

Answer

Bar diagram

Answer

Pie chart

Answer

Pictogram

Question 4

What does a highly sensitive test imply about its false negative rate?

Accepted Answer

Low false negative rate

Answer

High false positive rate

Answer

High true negative rate

Answer

High true positive rate

Question 5

In epidemiological studies, which type of diagram is most effective for representing disease incidence trends over time?

Accepted Answer

Line graph

Answer

Bar graph

Answer

Scatter plot

Answer

Pie chart

Question 6

An investigator wants to know the similarity of the mean peak flow of expiratory rates among non-smokers, light smokers, moderate smokers, and heavy smokers. Which statistical test of significance is appropriate?

Accepted Answer

One way ANOVA

Answer

Two way ANOVA

Answer

Student-t test

Answer

Chi square test

Question 7

What is the most appropriate statistical test to test the statistical significance of the change in blood cholesterol levels after a month's treatment with atorvastatin?

Accepted Answer

Paired t-test

Answer

Unpaired or independent t-test

Answer

Analysis of variance

Answer

Chi-square test

Question 8

For testing the statistical significance of the difference in heights among different groups of school children, which statistical test would be most appropriate?

Accepted Answer

ANOVA

Answer

Student's t test

Answer

chi-square test

Answer

Paired 't' test

Question 9

What is the 95% confidence interval for the intraocular pressure (IOP) in the 400 people, given a mean of 25 mm Hg and a standard deviation of 10 mm Hg?

Accepted Answer

24-26

Answer

22-28

Answer

23-27

Answer

21-29

Question 10

What is the 95% confidence interval in a study with an estimated prevalence of 10% and a sample size of 100, expressed as a percentage range?

Accepted Answer

4% to 16%

Answer

Inadequate information to calculate 95% CI

Answer

6% to 16%

Answer

5% to 15%
