Biostatistics Practice Questions

Q: A study is performed to assess the intelligence quotient and the crime rate in a neighborhood. Students at a local high school are given an assessment and their criminal and disciplinary records are reviewed. One of the subjects scores 2 standard deviations over the mean. What percent of students did he score higher than?

97.5%.

***97.5%***
- This question relates to the **normal distribution (bell curve) and the empirical rule (68-95-99.7 rule)** [1].
- Under the empirical rule, about 95% of the data falls within ±2 standard deviations of the mean [2]. The remaining 5% lies outside this range, split evenly as 2.5% in each tail.
- A score 2 standard deviations above the mean is therefore higher than 95% + 2.5% = **97.5%** of students.

*95%*
- This percentage represents the data that falls within **2 standard deviations of the mean (both sides)**, not the share of students scoring below a score 2 standard deviations above the mean [1].
- It would be correct if the question asked for the percentage of students whose scores fall within two standard deviations of the mean.

*68%*
- This percentage represents the data that falls within **1 standard deviation of the mean** according to the empirical rule [1].
- A score 2 standard deviations above the mean lies well outside this range.

*99.7%*
- This percentage represents the data that falls within **3 standard deviations of the mean (both sides)**, according to the empirical rule [2].
- It describes a range around the mean, not a percentile, and it concerns 3 standard deviations, whereas the question states 2. (A score 3 standard deviations above the mean would be higher than about 99.85% of students.)
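The 95% + 2.5% reasoning can be checked numerically. The following is a minimal sketch that assumes the scores follow an exact normal distribution; the helper names `percent_below` and `percent_within` are illustrative and not from the source.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def percent_below(z: float) -> float:
    """Percentage of a normal population scoring below the given z-score."""
    return 100.0 * STANDARD_NORMAL.cdf(z)


def percent_within(z: float) -> float:
    """Percentage of a normal population within +/- z standard deviations."""
    return 100.0 * (STANDARD_NORMAL.cdf(z) - STANDARD_NORMAL.cdf(-z))


# Share of students below a score 2 standard deviations above the mean.
print(f"Below z = 2:  {percent_below(2.0):.2f}%")
# Share of students within 2 standard deviations of the mean.
print(f"Within +/- 2: {percent_within(2.0):.2f}%")
```

The exact normal values come out slightly different from the rule-of-thumb figures (roughly 97.7% below and 95.4% within), which is why the empirical rule's 95% and 97.5% are best read as convenient approximations.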

Q: Which of the following is not a measure of dispersion?

Mean.

***Mean***
- The **mean** is a measure of **central tendency**, representing the average value of a dataset.
- It describes where the center of the data lies, not how spread out the data points are.

*Range*
- The **range** is a measure of **dispersion** that indicates the difference between the **maximum** and **minimum** values in a dataset.
- It quantifies the overall spread of the data from its lowest to highest points.

*Variance*
- **Variance** is a measure of **dispersion** that quantifies the **average squared deviation** of each data point from the mean.
- It shows how far the individual data points in a distribution deviate from the center.

*Standard error*
- The **standard error** measures the **precision and sampling variability** of a sample statistic (e.g., the sample mean) as an estimate of the population parameter.
- Strictly, it quantifies how much a sample statistic varies across repeated samples rather than the spread of individual observations within a dataset.
- Because it still quantifies variability, it is grouped with the measures of dispersion for this question, which leaves the mean as the only option that measures central tendency.
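To make the distinction concrete, the sketch below computes each quantity named in the options for one small sample. The data values are hypothetical, and the sample variance here divides by n - 1 (the usual sample estimator) rather than n.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical sample, chosen only for illustration.
sample = [4, 8, 6, 5, 7]

sample_mean = mean(sample)                    # central tendency
sample_range = max(sample) - min(sample)      # dispersion: max minus min
sample_variance = variance(sample)            # dispersion: squared deviations / (n - 1)
sample_sd = stdev(sample)                     # dispersion: square root of the variance
standard_error = sample_sd / sqrt(len(sample))  # variability of the sample mean

print(f"mean = {sample_mean}")
print(f"range = {sample_range}")
print(f"variance = {sample_variance}")
print(f"standard deviation = {sample_sd:.3f}")
print(f"standard error of the mean = {standard_error:.3f}")
```

Only the first quantity locates the center of the data; the others all describe variability, with the standard error describing the variability of the estimate rather than of the observations.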

Q: A cohort study follows 500 healthcare workers for 4 years to assess the incidence of occupational tuberculosis. During the study period, 20 workers developed tuberculosis. What is the incidence rate of tuberculosis per 1000 person-years in this cohort?

10 per 1000 person-years.

***10 per 1000 person-years***
- The **incidence rate** is calculated by dividing the number of new cases by the total person-time at risk in the population.
- Total person-years = 500 workers × 4 years = **2000 person-years** (taking each worker as followed for the full 4 years, as the question implies)
- Incidence rate = 20 cases / 2000 person-years = **0.01 per person-year**
- To express this per 1000 person-years: 0.01 × 1000 = **10 per 1000 person-years**
- This follows the standard epidemiological formula for incidence rate.

*5 per 1000 person-years*
- This value would be obtained if the total person-years at risk were 4000 (e.g., 500 workers followed for 8 years instead of 4 years).
- It underestimates the true incidence rate by using a denominator that is too large.

*7.5 per 1000 person-years*
- This result would occur if the person-years at risk were approximately 2667 (20/2667 × 1000 ≈ 7.5).
- This reflects an incorrect calculation of the **denominator** (person-years at risk).

*12.5 per 1000 person-years*
- This value incorrectly assumes a denominator of 1600 person-years (20/1600 × 1000 = 12.5).
- This could result from miscalculating the total follow-up time or the number of participants, leading to an overestimation of the incidence rate.
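The arithmetic can be sketched in a few lines. The function name `incidence_rate_per_1000` is hypothetical, and the calculation assumes, as the worked answer does, that all 500 workers contribute the full 4 years of follow-up.

```python
def incidence_rate_per_1000(new_cases: int, person_years: float) -> float:
    """New cases per 1000 person-years at risk."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return new_cases * 1000 / person_years


# Values from the question: 500 workers, 4 years, 20 new cases.
workers = 500
years_followed = 4
new_cases = 20

# Assumption: every worker is followed for the full study period.
total_person_years = workers * years_followed
rate = incidence_rate_per_1000(new_cases, total_person_years)
print(f"{rate} per 1000 person-years")

# The distractors correspond to wrong denominators.
print(incidence_rate_per_1000(new_cases, 4000))  # denominator too large
print(incidence_rate_per_1000(new_cases, 1600))  # denominator too small
```

In a real cohort, workers who develop tuberculosis or leave the study would stop contributing person-time, so the denominator is usually summed per person rather than taken as a simple product.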

Q: Accidents happening during weekends is an example of -

Cyclic trends.

***Cyclic trends***
- Accidents happening during weekends represent a **regular, recurrent pattern** over a short period (weekly), which is characteristic of a cyclic trend.
- These trends show peaks and troughs that occur at **predictable intervals**, such as every week or month.

*Point source epidemic*
- A **point source epidemic** refers to an outbreak where exposure to the causative agent is brief and simultaneous, resulting in a sharp rise and fall in cases, often from a single event or source.
- This typically describes disease outbreaks following a contamination event, not recurring patterns of accidents over weekends.

*Secular trends*
- **Secular trends** describe long-term changes over many years or decades, showing a gradual increase or decrease in prevalence or incidence.
- This concept is used for gradual shifts in health indicators over long periods, not for short-term weekly fluctuations.

*Seasonal trends*
- **Seasonal trends** refer to patterns that recur annually, often linked to changes in seasons, such as influenza outbreaks in winter or agricultural accidents in summer.
- While weekends are a recurring interval, the pattern is weekly, not yearly, which distinguishes it from seasonal trends.

Q: Mean bone density amongst 2 groups of 50 people each is compared, which would be the best test:

Student t-test.

***Student t-test***
- The **Student's t-test** is the appropriate statistical test for comparing the **means of two independent groups** when the data are continuous and approximately normally distributed.
- Bone density is a **continuous variable**, and the scenario involves comparing the mean bone density between two distinct groups.

*Fisher exact test*
- The **Fisher exact test** is used for analyzing **categorical data** in a 2×2 contingency table, especially when sample sizes are small.
- It is not suitable for comparing continuous variables like bone density.

*McNemar test*
- **McNemar's test** is used to analyze paired nominal data, typically when comparing two related proportions from the same subjects before and after an intervention.
- This scenario involves **independent groups**, not paired data.

*Chi-square test*
- The **chi-square test** is primarily used to test whether there is a significant association between **categorical variables**.
- It is not appropriate for comparing the means of continuous data like bone density.
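For readers who want to see what the test computes, here is a minimal sketch of the pooled (equal-variance) two-sample t statistic using only the standard library. The function name and the bone density values are hypothetical, the groups are kept small for brevity (the question has 50 per group), and converting the statistic to a p-value would require a t distribution table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for two independent samples.

    Assumes both groups are roughly normal with similar variances.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    # Standard error of the difference between the two means.
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df


# Hypothetical bone density readings (g/cm^2) for two independent groups.
group_a = [1.10, 1.05, 0.98, 1.12, 1.01, 1.07]
group_b = [0.95, 1.02, 0.91, 0.99, 0.97, 1.00]

t_value, degrees_of_freedom = pooled_t_statistic(group_a, group_b)
print(f"t = {t_value:.3f}, df = {degrees_of_freedom}")
```

The statistic is simply the difference in means divided by its standard error, which is why the test suits a continuous outcome measured in two separate groups.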

Q: For a positively skewed curve, which measure of central tendency is largest?

Mean.

***Mean***
- In a **positively skewed distribution**, the tail of the distribution extends towards higher values, pulling the **mean** in that direction, making it the largest among the three measures of central tendency.
- The presence of **outliers** with large values in the tail disproportionately increases the mean.

*Mode*
- The **mode** represents the most frequently occurring value in the data set.
- In a positively skewed distribution, the mode will be located at the **peak of the distribution**, which is typically the smallest value among the three measures of central tendency.

*All are equal*
- This statement is characteristic of a **perfectly symmetrical distribution** (e.g., a normal distribution), where the **mean, median, and mode** are all equal.
- A positively skewed curve is asymmetrical, meaning these measures will not be equal.

*Median*
- The **median** is the middle value in an ordered data set, dividing the data into two equal halves.
- In a positively skewed distribution, the median will be shifted towards the right of the mode but will still be to the left of the mean, meaning it is **smaller than the mean**.
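The ordering mode < median < mean can be seen on a small right-skewed sample. The values below are hypothetical, and the ordering is the typical pattern for a unimodal positively skewed distribution rather than a guarantee for every dataset.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed sample: most values are small, with a long
# tail of larger values on the right.
skewed_sample = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

sample_mode = mode(skewed_sample)      # most frequent value (the peak)
sample_median = median(skewed_sample)  # middle of the ordered data
sample_mean = mean(skewed_sample)      # pulled toward the long right tail

print(f"mode = {sample_mode}, median = {sample_median}, mean = {sample_mean}")
```

The two large values in the tail raise the mean well above the median, while the mode stays at the peak on the left.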

Q: Most appropriate measure for central tendency when data includes extreme values?

Median.

***Median***
- The **median** is less affected by **extreme values** or **outliers** because it represents the middle value in an ordered dataset.
- It provides a more robust measure of central tendency when the data distribution is **skewed**.

*Mode*
- The **mode** represents the most frequently occurring value in a dataset; it does not account for the magnitude of other values.
- While it is not influenced by extreme values, it may not accurately represent the central tendency of a continuous dataset, especially if there are **multiple modes** or if the most frequent value is not central.

*Mean*
- The **mean** is calculated by summing all values and dividing by the number of values, making it highly susceptible to **extreme values** or **outliers**.
- A single very large or very small value can significantly distort the mean, pulling it away from the center of most data points.

*Geometric mean*
- The **geometric mean** is primarily used for data that are **multiplicative** in nature, such as rates of change, and for some positively skewed distributions.
- While it can be less sensitive to extreme values than the arithmetic mean for certain types of data, it is not the most appropriate general measure of central tendency when outliers are present outside a multiplicative context.
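A quick illustration of this robustness: the sketch below adds one extreme value to a small hypothetical dataset and compares how far the mean and the median move.

```python
from statistics import mean, median

# Hypothetical measurements without any extreme value.
baseline = [10, 12, 13, 14, 15]
# The same data with a single extreme value appended.
with_outlier = baseline + [200]

mean_before, median_before = mean(baseline), median(baseline)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(f"mean:   {mean_before} -> {mean_after}")
print(f"median: {median_before} -> {median_after}")

# Shift caused by the one extreme value.
mean_shift = mean_after - mean_before
median_shift = median_after - median_before
print(f"mean shifted by {mean_shift:.1f}, median shifted by {median_shift:.1f}")
```

One outlier moves the mean by more than thirty units while the median moves by half a unit, which is the behavior the accepted answer relies on.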

Question 1

Interpret the following graph.

Accepted Answer

Normal, positively skewed, negatively skewed, normal with outliers

Answer

Normal, negatively skewed, positively skewed, skewed with outliers

Answer

Skewed with outliers, positively skewed, negatively skewed, normal

Answer

Normal, negatively skewed, positively skewed, normal with outliers

Question 2

A study is performed to assess the intelligence quotient and the crime rate in a neighborhood. Students at a local high school are given an assessment and their criminal and disciplinary records are reviewed. One of the subjects scores 2 standard deviations over the mean. What percent of students did he score higher than?

Accepted Answer

97.5%

Answer

95%

Answer

68%

Answer

99.7%

Question 3

The principal investigators of both studies recently met at a rheumatology conference. They both expressed an interest in combining data from their individual studies to be analyzed in a single study. A third researcher at the conference, who conducted her own project on the same topic recently, has also indicated she would like to contribute data to a pooled analysis. Which of the following statements regarding their new study design is true?

Accepted Answer

The results are more precise in comparison to individual studies

Answer

It overcomes limitations in the quality of individual studies

Answer

It is unable to resolve differences in outcomes between individual studies

Answer

There is a decreased likelihood of type I error

Answer

It has a lower level of clinical evidence than an individual cohort study

Question 4

Which of the following is not a measure of dispersion?

Accepted Answer

Mean

Answer

Range

Answer

Variance

Answer

Standard error

Question 5

A cohort study follows 500 healthcare workers for 4 years to assess the incidence of occupational tuberculosis. During the study period, 20 workers developed tuberculosis. What is the incidence rate of tuberculosis per 1000 person-years in this cohort?

Accepted Answer

10 per 1000 person-years

Answer

5 per 1000 person-years

Answer

12.5 per 1000 person-years

Answer

7.5 per 1000 person-years

Question 6

Accidents happening during weekends is an example of -

Accepted Answer

Cyclic trends

Answer

Point source epidemic

Answer

Secular trends

Answer

Seasonal trends

Question 7

Mean bone density amongst 2 groups of 50 people each is compared, which would be the best test:

Accepted Answer

Student t-test

Answer

McNemar test

Answer

Chi-square test

Answer

Fisher exact test

Question 8

You have diagnosed a patient clinically as having SLE and ordered 6 tests out of which 4 tests have come positive and 2 are negative. Which of the following values are required to determine the probability of SLE at this point?

Accepted Answer

Prior probability of SLE, sensitivity and specificity of each test

Answer

Relative risk of SLE in the patient

Answer

Incidence and prevalence of SLE

Answer

Incidence of SLE and the predictive value of each test

Question 9

For a positively skewed curve, which measure of central tendency is largest?

Accepted Answer

Mean

Answer

Mode

Answer

All are equal

Answer

Median

Question 10

Most appropriate measure for central tendency when data includes extreme values?

Accepted Answer

Median

Answer

Mode

Answer

Mean

Answer

Geometric mean

Biostatistics — MCQs
