Study Design Practice Questions

Q: A neuro-oncology investigator has recently conducted a randomized controlled trial in which the addition of a novel alkylating agent to radiotherapy was found to prolong survival in comparison to radiotherapy alone (HR = 0.7, p < 0.01). A number of surviving participants who took the alkylating agent reported that they had experienced significant nausea from the medication. The investigator surveyed all participants in both the treatment and the control group on their nausea symptoms by self-report rated mild, moderate, or severe. The investigator subsequently compared the two treatment groups with regard to nausea level.

| | Mild nausea | Moderate nausea | Severe nausea |
|---|---|---|---|
| Treatment group (%) | 20 | 30 | 50 |
| Control group (%) | 35 | 35 | 30 |

Which of the following statistical methods would be most appropriate to assess the statistical significance of these results?

Chi-square test.

**Chi-square test**
- The **Chi-square test** is appropriate for comparing **categorical data** (mild, moderate, severe) between two or more independent groups (treatment vs. control).
- It assesses whether there is a statistically significant association between the two categorical variables (treatment group and nausea severity).

*Pearson correlation coefficient*
- The **Pearson correlation coefficient** is used to measure the **linear relationship** between two **continuous variables**.
- Nausea severity (mild, moderate, severe) is an **ordinal categorical variable**, not a continuous one.

*Multiple logistic regression*
- **Multiple logistic regression** is used to predict a **binary outcome** (e.g., presence or absence of nausea) based on one or more independent variables, which can be continuous or categorical.
- The outcome here is **ordinal categorical** (mild, moderate, severe nausea), not binary. While logistic regression can be adapted for ordinal outcomes, a simpler Chi-square test is more direct for comparing distributions without prediction.

*Unpaired t-test*
- An **unpaired t-test** is used to compare the **means of two independent continuous variables**.
- Nausea levels are categorical, and we are interested in comparing proportions within categories, not means.

*Paired t-test*
- A **paired t-test** is used to compare the **means of two related (paired) continuous variables**.
- The study involves independent treatment and control groups, and the nausea data is categorical, making the paired t-test unsuitable.
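As a rough illustration of how the chi-square test would be applied here, the sketch below computes the Pearson chi-square statistic by hand. The question reports percentages only, so the counts are an assumption: 100 participants per arm, which lets the percentages double as counts. The function name `chi_square_independence` is hypothetical and not taken from any library.

```python
import math

def chi_square_independence(table):
    """Return the Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and nausea severity.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof

# ASSUMPTION: 100 participants per arm (the question gives percentages, not counts).
observed = [
    [20, 30, 50],  # treatment group: mild, moderate, severe
    [35, 35, 30],  # control group: mild, moderate, severe
]
statistic, dof = chi_square_independence(observed)

# With exactly 2 degrees of freedom, the chi-square upper-tail probability is exp(-x / 2).
p_value = math.exp(-statistic / 2)
print(f"chi2 = {statistic:.2f}, dof = {dof}, p = {p_value:.4f}")
```

Under the assumed group sizes the statistic is about 9.48 on 2 degrees of freedom. With different real sample sizes the statistic and p-value would change, since the test operates on counts rather than percentages.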

Q: A survey was conducted in a US midwestern town in an effort to assess maternal mortality over the past year. The data from the survey are given in the table below:

| Measure | Count |
|---|---|
| Women of childbearing age | 250,000 |
| Maternal deaths | 2,500 |
| Number of live births | 100,000 |
| Number of deaths of women of childbearing age | 7,500 |

Maternal death is defined as the death of a woman while pregnant or within 42 days of termination of pregnancy from any cause related to, or aggravated by, the pregnancy. Which of the following is the maternal mortality rate in this midwestern town?

2,500 per 100,000 live births.

***2,500 per 100,000 live births***
- The measure asked for here is the number of **maternal deaths** per 100,000 **live births** (formally the maternal mortality ratio, although the question calls it a rate). The given data directly provide both values.
- Calculation: (2,500 maternal deaths / 100,000 live births) × 100,000 = **2,500 per 100,000 live births**.

*1,000 per 100,000 live births*
- This value results from using women of childbearing age as the denominator: (2,500 / 250,000) × 100,000 = 1,000.
- The denominator for this measure is live births, not the population of women of childbearing age.

*33 per 100,000 live births*
- This value is far lower than the correct figure. It matches the share of deaths among women of childbearing age that were maternal, expressed as a percentage: (2,500 / 7,500) × 100 ≈ 33.
- That quantity is a proportion of deaths, not a rate per live births.

*3,000 per 100,000 live births*
- This value results from dividing all deaths of women of childbearing age by the number of women of childbearing age: (7,500 / 250,000) × 100,000 = 3,000.
- The definition of maternal death is specific to pregnancy-related or pregnancy-aggravated causes, so including non-maternal deaths in the numerator inflates the figure, and the denominator should be live births.

*33,300 per 100,000 live births*
- This figure results from incorrectly calculating the proportion of maternal deaths among all deaths of women of childbearing age: (2,500 / 7,500) × 100,000 ≈ 33,333.
- This is a conceptual error, as the denominator should be live births, not total deaths of women of childbearing age.
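The arithmetic behind the correct answer and the distractors can be checked with a short sketch. The helper name `per_100k` and the variable names are illustrative assumptions; the four input numbers come from the survey table in the question.

```python
def per_100k(numerator, denominator):
    """Scale a ratio to 'per 100,000' (multiply first to keep the arithmetic exact)."""
    return numerator * 100_000 / denominator

maternal_deaths = 2_500
live_births = 100_000
women_childbearing_age = 250_000
deaths_women_childbearing_age = 7_500

# Correct: maternal deaths per 100,000 live births.
per_live_births = per_100k(maternal_deaths, live_births)

# Distractors: same scaling, wrong numerator and/or denominator.
per_women = per_100k(maternal_deaths, women_childbearing_age)
all_deaths_per_women = per_100k(deaths_women_childbearing_age, women_childbearing_age)
share_of_deaths = per_100k(maternal_deaths, deaths_women_childbearing_age)

print(per_live_births, per_women, all_deaths_per_women, round(share_of_deaths))
```

Only the first quantity uses live births as the denominator, which is what the accepted answer relies on.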

Q: The height of American adults is expected to follow a normal distribution, with a typical male adult having an average height of 69 inches with a standard deviation of 0.1 inches. An investigator has been informed about a community in the American Midwest with a history of heavy air and water pollution in which a lower mean height has been reported. The investigator plans to sample 30 male residents to test the claim that heights in this town differ significantly from the national average, with heights assumed to be normally distributed. The significance level is set at 10% and the probability of a type 2 error is assumed to be 15%. Based on this information, which of the following is the power of the proposed study?

0.85.

***0.85***
- **Power** is defined as **1 - β**, where β is the **probability of a Type II error**.
- Given that the probability of a **Type II error (β)** is 15% or 0.15, the power of the study is 1 - 0.15 = **0.85**.

*0.10*
- This value represents the **significance level (α)**, which is the probability of committing a **Type I error** (rejecting a true null hypothesis).
- The significance level is distinct from the **power of the study**, which relates to Type II errors.

*0.90*
- This value would be the power if the **Type II error rate (β)** was 0.10 (1 - 0.10 = 0.90), but the question specifies a β of 0.15.
- It is also the complement of the significance level (1 - α), which is not the definition of power.

*0.15*
- This value is the **probability of a Type II error (β)**, not the power of the study.
- **Power** is the probability of correctly rejecting a false null hypothesis, which is 1 - β.

*0.05*
- While 0.05 is a common significance level (α), it is not given as the significance level in this question (which is 0.10).
- This value also does not represent the power of the study, which would be calculated using the **Type II error rate**.
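One way to keep α, β, and power apart is to lay out the four possible test outcomes. The sketch below is only an illustration using the α and β stated in the question; the dictionary layout and names are assumptions made for this example.

```python
alpha = 0.10  # significance level: P(reject H0 | H0 true), the Type I error rate
beta = 0.15   # P(fail to reject H0 | H0 false), the Type II error rate

# Each row conditions on the true state of the null hypothesis and sums to 1.
outcome_probabilities = {
    "H0 true": {
        "reject H0 (Type I error)": alpha,
        "fail to reject H0": 1 - alpha,
    },
    "H0 false": {
        "reject H0 (power)": 1 - beta,
        "fail to reject H0 (Type II error)": beta,
    },
}

power = outcome_probabilities["H0 false"]["reject H0 (power)"]
print(f"power = {power:.2f}")
```

Note that the sample size, mean, and standard deviation in the question are not needed once β is given.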

Q: An academic medical center in the United States is approached by a pharmaceutical company to run a small clinical trial to test the effectiveness of its new drug, compound X. The company wants to know if the measured hemoglobin A1c (HbA1c) of patients with type 2 diabetes receiving metformin and compound X would be lower than that of control subjects receiving only metformin. After a year of study and data analysis, researchers conclude that the control and treatment groups did not differ significantly in their HbA1c levels. However, parallel clinical trials in several other countries found that compound X led to a significant decrease in HbA1c. Interested in the discrepancy between these findings, the company funded a larger study in the United States, which confirmed that compound X decreased HbA1c levels. After compound X was approved by the FDA, and after several years of use in the general population, outcomes data confirmed that it effectively lowered HbA1c levels and increased overall survival. What term best describes the discrepant findings in the initial clinical trial run by institution A?

Type II error.

***Type II error***
- A **Type II error** occurs when a study fails to **reject a false null hypothesis**, meaning it concludes there is no significant difference or effect when one actually exists.
- In this case, the initial US trial incorrectly concluded that compound X had no significant effect on HbA1c, while subsequent larger studies and real-world data proved it did.

*Type I error*
- A **Type I error** (alpha error) occurs when a study incorrectly **rejects a true null hypothesis**, concluding there is a significant difference or effect when there isn't.
- This scenario describes the opposite: the initial study failed to find an effect that genuinely existed, indicating a Type II error, not a Type I error.

*Hawthorne effect*
- The **Hawthorne effect** is a type of reactivity in which individuals modify an aspect of their behavior in response to their awareness of being observed.
- This effect does not explain the initial trial's failure to detect a real drug effect; rather, it relates to participants changing behavior due to study participation itself.

*Publication bias*
- **Publication bias** occurs when studies with positive or statistically significant results are more likely to be published than those with negative or non-significant results.
- While relevant to the literature as a whole, it doesn't explain the discrepancy in findings within a single drug's development where a real effect was initially missed.

*Confirmation bias*
- **Confirmation bias** is the tendency to search for, interpret, favor, and recall information in a way that confirms one's preexisting beliefs or hypotheses.
- This bias would likely lead researchers to *find* an effect if they expected one, or to disregard data that contradicts their beliefs, which is not what happened in the initial trial.
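A small trial is prone to Type II error because its power is low. The sketch below approximates the power of a two-sided, two-sample z-test at two sample sizes. Every number in it is hypothetical (a true HbA1c difference of 0.5 points, a standard deviation of 1.5, and 20 versus 200 patients per arm), since the question gives no such figures, and `two_arm_power` is an illustrative helper rather than a library function.

```python
from math import sqrt
from statistics import NormalDist

def two_arm_power(true_diff, sd, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test with equal arm sizes."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    standard_error = sd * sqrt(2 / n_per_arm)
    # Probability that the test statistic passes the critical value in the
    # direction of the true effect; the opposite tail is negligible and ignored.
    return std_normal.cdf(abs(true_diff) / standard_error - z_crit)

# HYPOTHETICAL inputs, chosen only to illustrate the effect of sample size.
small_trial_power = two_arm_power(true_diff=0.5, sd=1.5, n_per_arm=20)
large_trial_power = two_arm_power(true_diff=0.5, sd=1.5, n_per_arm=200)

print(f"small trial: power = {small_trial_power:.2f}, beta = {1 - small_trial_power:.2f}")
print(f"large trial: power = {large_trial_power:.2f}, beta = {1 - large_trial_power:.2f}")
```

Under these assumed inputs the small trial misses the real effect most of the time, which is consistent with the larger follow-up study detecting what the initial trial did not.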

Q: A group of 100 medical students took an end-of-year exam. The mean score on the exam was 70%, with a standard deviation of 25%. The professor states that a student's score must be within the 95% confidence interval of the mean to pass the exam. Which of the following is the minimum score a student can have to pass the exam?

65%.

***65%***
- To find the **95% confidence interval (CI) of the mean**, we use the formula: Mean ± (Z-score × Standard Error). For a 95% CI, the Z-score is approximately **1.96**.
- The **Standard Error (SE)** is calculated as SD/√n, where n is the sample size (100 students). So, SE = 25%/√100 = 25%/10 = **2.5%**.
- The 95% CI is 70% ± (1.96 × 2.5%) = 70% ± 4.9%. The lower bound is 70% - 4.9% = **65.1%**, which rounds to **65%** as the minimum passing score.

*45%*
- This value is significantly lower than the calculated lower bound of the 95% confidence interval (approximately 65.1%).
- It would represent a score far outside the defined passing range.

*63.75%*
- This value falls below the calculated lower bound of the 95% confidence interval (approximately 65.1%).
- While close, this score would not meet the professor's criterion for passing.

*67.5%*
- This value is within the 95% confidence interval (65.1% to 74.9%) but is **not the minimum score**.
- Lower scores within the interval would still qualify as passing.

*20%*
- This score is extremely low and falls significantly outside the 95% confidence interval for a mean of 70%.
- It would indicate performance far below the defined passing threshold.
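The confidence interval calculation can be reproduced with the standard library. This is a minimal sketch using the mean, standard deviation, and sample size from the question; it uses the exact normal quantile rather than the rounded 1.96, so the bounds differ from the hand calculation only in later decimals.

```python
from math import sqrt
from statistics import NormalDist

mean_score = 70.0  # percent
sd_score = 25.0    # percent
n_students = 100

# Two-sided 95% interval: 2.5% in each tail of the standard normal.
z = NormalDist().inv_cdf(0.975)
standard_error = sd_score / sqrt(n_students)

lower_bound = mean_score - z * standard_error
upper_bound = mean_score + z * standard_error
print(f"SE = {standard_error:.1f}, 95% CI = ({lower_bound:.1f}, {upper_bound:.1f})")
```

The key step is dividing the standard deviation by √n; using the standard deviation itself (70 ± 1.96 × 25) would give a much wider interval that describes individual scores rather than the mean.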

Q: A researcher is investigating the relationship between interleukin-1 (IL-1) levels and mortality in patients with end-stage renal disease (ESRD) on hemodialysis. In 2017, 10 patients (patients 1–10) with ESRD on hemodialysis were recruited for a pilot study in which IL-1 levels were measured (mean = 88.1 pg/mL). In 2018, 5 additional patients (patients 11–15) were recruited. Results are shown:

| Patient | IL-1 level (pg/mL) | Patient | IL-1 level (pg/mL) |
|---|---|---|---|
| Patient 1 (2017) | 84 | Patient 11 (2018) | 91 |
| Patient 2 (2017) | 87 | Patient 12 (2018) | 32 |
| Patient 3 (2017) | 95 | Patient 13 (2018) | 86 |
| Patient 4 (2017) | 93 | Patient 14 (2018) | 90 |
| Patient 5 (2017) | 99 | Patient 15 (2018) | 81 |
| Patient 6 (2017) | 77 | | |
| Patient 7 (2017) | 82 | | |
| Patient 8 (2017) | 90 | | |
| Patient 9 (2017) | 85 | | |
| Patient 10 (2017) | 89 | | |

Which of the following statements about the results of the study is most accurate?

The median of IL-1 measurements is now larger than the mean.

***The median of IL-1 measurements is now larger than the mean.***
- The new mean is 84.07 (the sum of all 15 IL-1 levels, 1,261, divided by 15). The sorted data set is 32, 77, 81, 82, 84, 85, 86, **87**, 89, 90, 90, 91, 93, 95, 99; the median is the 8th value, which is 87. Thus, the new median (87) is larger than the new mean (84.07).
- This conclusion requires calculation of both the **mean** and **median** for the combined dataset of 15 patients.

*The mean of IL-1 measurements is now larger than the mode.*
- The new mean is 84.07. The mode is 90 (it appears twice, while all other values appear once). Therefore, the mean (84.07) is *not* larger than the mode (90).
- Calculation of the **mean** and identification of the **mode** for the combined dataset negates this statement.

*The range of the data set is unaffected by the addition of five new patients in 2018.*
- In 2017, the range was 99 (max) - 77 (min) = 22. With the addition of patient 12 (IL-1 level of 32), the minimum changed from 77 to 32.
- The new range is 99 (max) - 32 (min) = 67, which is a significant increase from the original range of 22.

*The standard deviation was decreased by the five new patients who joined the study in 2018.*
- The addition of patient 12 with an IL-1 level of 32, which is an **outlier**, significantly increased the **spread of the data**.
- A larger spread of data, especially due to an outlier, typically **increases the standard deviation**, not decreases it.

*Systematic error was introduced by the five new patients who joined the study in 2018.*
- **Systematic error** refers to a consistent, repeatable error in measurement or experimental design that biases results in a particular direction.
- The information provided describes individual patient data and does not indicate any **consistent bias** in data collection or measurement methods for the new patients.
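The summary statistics for the combined data set can be verified directly from the values in the table. The sketch below uses Python's `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median, mode, stdev

il1_2017 = [84, 87, 95, 93, 99, 77, 82, 90, 85, 89]  # patients 1-10
il1_2018 = [91, 32, 86, 90, 81]                      # patients 11-15
il1_all = il1_2017 + il1_2018

combined_mean = mean(il1_all)
combined_median = median(il1_all)
combined_mode = mode(il1_all)

range_2017 = max(il1_2017) - min(il1_2017)
range_all = max(il1_all) - min(il1_all)

# Sample standard deviation before and after adding the 2018 patients.
sd_2017 = stdev(il1_2017)
sd_all = stdev(il1_all)

print(f"mean = {combined_mean:.2f}, median = {combined_median}, mode = {combined_mode}")
print(f"range: {range_2017} -> {range_all}")
print(f"standard deviation: {sd_2017:.2f} -> {sd_all:.2f}")
```

The single low value of 32 pulls the mean below the median and widens both the range and the standard deviation, while the median barely moves.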

Q: A group of researchers is trying to create a new drug that more effectively decreases systolic blood pressure levels, and it has entered the clinical trial period of their drug's development. If, during their trial, the scientists wanted to examine a mutual or linear relationship between 2 continuous variables, which of the following statistical models would be most appropriate for them to use?

Correlation.

***Correlation***
- **Correlation** is used to assess the strength and direction of a **linear relationship** between two **continuous variables**.
- In this scenario, researchers would use it to determine if there's a relationship between drug dosage and systolic blood pressure, as both are continuous.

*Chi-square test*
- The **chi-square test** is used to examine the relationship between two **categorical variables**.
- It is not appropriate for understanding linear relationships between continuous variables like drug dosage and blood pressure.

*Analysis of variance*
- **Analysis of variance (ANOVA)** is used to compare the means of **three or more groups** or treatments.
- It identifies if there are statistically significant differences between group means, rather than analyzing the mutual relationship between two continuous variables.

*Paired t-test*
- A **paired t-test** is used to compare the means of **two related groups** or repeated measurements from the same subjects.
- It is often used to assess the effect of an intervention by comparing measurements before and after the intervention, not for observing a relationship between two continuous variables.

*Independent t-test*
- An **independent t-test** compares the means of **two independent groups**.
- This test is not suitable for exploring a mutual or linear relationship between two continuous variables within a single group or dataset.
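To make the idea concrete, the sketch below computes a Pearson correlation coefficient by hand. The dose and blood pressure values are hypothetical, invented purely for illustration, and `pearson_r` is an illustrative helper rather than a library call.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance_sum = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance_sum / sqrt(ss_x * ss_y)

# HYPOTHETICAL data: daily dose (mg) and systolic blood pressure (mmHg).
dose_mg = [0, 10, 20, 30, 40]
systolic_mmhg = [150, 146, 141, 137, 131]

r = pearson_r(dose_mg, systolic_mmhg)
print(f"r = {r:.3f}")
```

A value of r near -1 indicates a strong negative linear relationship, near +1 a strong positive one, and near 0 little or no linear relationship.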

Question 1

Study X examined the relationship between coffee consumption and lung cancer. The authors of Study X retrospectively reviewed patients' reported coffee consumption and found that drinking greater than 6 cups of coffee per day was associated with an increased risk of developing lung cancer. However, Study X was criticized by the authors of Study Y. Study Y showed that increased coffee consumption was associated with smoking. What type of bias affected Study X, and what study design is geared to reduce the chance of that bias?

Accepted Answer

Confounding; randomization

Answer

Observer bias; double blind analysis

Answer

Selection bias; randomization

Answer

Lead time bias; placebo

Answer

Measurement bias; blinding

Question 2

A neuro-oncology investigator has recently conducted a randomized controlled trial in which the addition of a novel alkylating agent to radiotherapy was found to prolong survival in comparison to radiotherapy alone (HR = 0.7, p < 0.01). A number of surviving participants who took the alkylating agent reported that they had experienced significant nausea from the medication. The investigator surveyed all participants in both the treatment and the control group on their nausea symptoms by self-report rated mild, moderate, or severe. The investigator subsequently compared the two treatment groups with regard to nausea level.

| | Mild nausea | Moderate nausea | Severe nausea |
|---|---|---|---|
| Treatment group (%) | 20 | 30 | 50 |
| Control group (%) | 35 | 35 | 30 |

Which of the following statistical methods would be most appropriate to assess the statistical significance of these results?

Accepted Answer

Chi-square test

Answer

Pearson correlation coefficient

Answer

Multiple logistic regression

Answer

Unpaired t-test

Answer

Paired t-test

Question 3

A group of investigators seeks to compare the non-inferiority of a new angiotensin receptor blocker, salisartan, with losartan for reduction of blood pressure. 2,000 patients newly diagnosed with hypertension are recruited for the trial; the first 1,000 recruited patients are administered losartan, and the other half are administered salisartan. Patients with a baseline systolic blood pressure less than 100 mmHg are excluded from the study. Blood pressure is measured every week for four weeks, with the primary outcome being a reduction in systolic blood pressure by salisartan within 10% of that of the control. Secondary outcomes include incidence of subjective improvement in symptoms, improvement of ejection fraction, and incidence of cough. 500 patients withdraw from the study due to symptomatic side effects. In an intention-to-treat analysis, salisartan is deemed to be non-inferior to losartan for the primary outcome but inferior for all secondary outcomes. As the investigators launch a national advertising campaign for salisartan, independent groups report that the drug is inferior for its primary outcome compared to losartan and associated with respiratory failure among patients with pulmonary hypertension. How could this study have been improved?

Accepted Answer

Randomization

Answer

Increased study duration

Answer

Post hoc analysis of the primary outcome among patients who withdrew from the study

Answer

Increased sample size

Answer

Retrial of primary outcome for clinical effectiveness instead of non-inferiority

Question 4

A survey was conducted in a US midwestern town in an effort to assess maternal mortality over the past year. The data from the survey are given in the table below:

| Measure | Count |
|---|---|
| Women of childbearing age | 250,000 |
| Maternal deaths | 2,500 |
| Number of live births | 100,000 |
| Number of deaths of women of childbearing age | 7,500 |

Maternal death is defined as the death of a woman while pregnant or within 42 days of termination of pregnancy from any cause related to, or aggravated by, the pregnancy. Which of the following is the maternal mortality rate in this midwestern town?

Accepted Answer

2,500 per 100,000 live births

Answer

1,000 per 100,000 live births

Answer

33 per 100,000 live births

Answer

3,000 per 100,000 live births

Answer

33,300 per 100,000 live births

Question 5

The height of American adults is expected to follow a normal distribution, with a typical male adult having an average height of 69 inches with a standard deviation of 0.1 inches. An investigator has been informed about a community in the American Midwest with a history of heavy air and water pollution in which a lower mean height has been reported. The investigator plans to sample 30 male residents to test the claim that heights in this town differ significantly from the national average, with heights assumed to be normally distributed. The significance level is set at 10% and the probability of a type 2 error is assumed to be 15%. Based on this information, which of the following is the power of the proposed study?

Accepted Answer

0.85

Answer

0.10

Answer

0.90

Answer

0.15

Answer

0.05

Question 6

Which of the following study designs would be most appropriate to investigate the association between electronic cigarette use and the subsequent development of lung cancer?

Accepted Answer

Subjects who smoke electronic cigarettes and subjects who do not smoke

Answer

Subjects with lung cancer who smoke and subjects with lung cancer who did not smoke

Answer

Subjects who smoke electronic cigarettes and subjects who smoke normal cigarettes

Answer

Subjects with lung cancer who smoke and subjects without lung cancer who smoke

Answer

Subjects with lung cancer and subjects without lung cancer

Question 7

An academic medical center in the United States is approached by a pharmaceutical company to run a small clinical trial to test the effectiveness of its new drug, compound X. The company wants to know if the measured hemoglobin A1c (HbA1c) of patients with type 2 diabetes receiving metformin and compound X would be lower than that of control subjects receiving only metformin. After a year of study and data analysis, researchers conclude that the control and treatment groups did not differ significantly in their HbA1c levels.

However, parallel clinical trials in several other countries found that compound X led to a significant decrease in HbA1c. Interested in the discrepancy between these findings, the company funded a larger study in the United States, which confirmed that compound X decreased HbA1c levels. After compound X was approved by the FDA, and after several years of use in the general population, outcomes data confirmed that it effectively lowered HbA1c levels and increased overall survival. What term best describes the discrepant findings in the initial clinical trial run by institution A?

Accepted Answer

Type II error

Answer

Type I error

Answer

Hawthorne effect

Answer

Publication bias

Answer

Confirmation bias

Question 8

A group of 100 medical students took an end-of-year exam. The mean score on the exam was 70%, with a standard deviation of 25%. The professor states that a student's score must be within the 95% confidence interval of the mean to pass the exam. Which of the following is the minimum score a student can have to pass the exam?

Accepted Answer

65%

Answer

45%

Answer

63.75%

Answer

67.5%

Answer

20%

Question 9

A researcher is investigating the relationship between interleukin-1 (IL-1) levels and mortality in patients with end-stage renal disease (ESRD) on hemodialysis. In 2017, 10 patients (patients 1–10) with ESRD on hemodialysis were recruited for a pilot study in which IL-1 levels were measured (mean = 88.1 pg/mL). In 2018, 5 additional patients (patients 11–15) were recruited. Results are shown:

| Patient | IL-1 level (pg/mL) | Patient | IL-1 level (pg/mL) |
|---|---|---|---|
| Patient 1 (2017) | 84 | Patient 11 (2018) | 91 |
| Patient 2 (2017) | 87 | Patient 12 (2018) | 32 |
| Patient 3 (2017) | 95 | Patient 13 (2018) | 86 |
| Patient 4 (2017) | 93 | Patient 14 (2018) | 90 |
| Patient 5 (2017) | 99 | Patient 15 (2018) | 81 |
| Patient 6 (2017) | 77 | | |
| Patient 7 (2017) | 82 | | |
| Patient 8 (2017) | 90 | | |
| Patient 9 (2017) | 85 | | |
| Patient 10 (2017) | 89 | | |

Which of the following statements about the results of the study is most accurate?

Accepted Answer

The median of IL-1 measurements is now larger than the mean.

Answer

The mean of IL-1 measurements is now larger than the mode.

Answer

The standard deviation was decreased by the five new patients who joined the study in 2018.

Answer

Systematic error was introduced by the five new patients who joined the study in 2018.

Answer

The range of the data set is unaffected by the addition of five new patients in 2018.

Question 10

A group of researchers is trying to create a new drug that more effectively decreases systolic blood pressure levels, and it has entered the clinical trial period of their drug's development. If, during their trial, the scientists wanted to examine a mutual or linear relationship between 2 continuous variables, which of the following statistical models would be most appropriate for them to use?

Accepted Answer

Correlation

Answer

Chi-square test

Answer

Analysis of variance

Answer

Paired t-test

Answer

Independent t-test
