RCTs US Medical PG Practice Questions and MCQs
Question 1: A study seeks to investigate the therapeutic efficacy of treating asymptomatic subclinical hypothyroidism in preventing symptoms of hypothyroidism. The investigators found 300 asymptomatic patients with subclinical hypothyroidism, defined as serum thyroid-stimulating hormone (TSH) of 5 to 10 μU/mL with normal serum thyroxine (T4) levels. The patients were randomized to either thyroxine 75 μg daily or placebo. Both investigators and study subjects were blinded. Baseline patient characteristics were distributed similarly in the treatment and control group (p > 0.05). Participants' serum T4 and TSH levels and subjective quality of life were evaluated at a 3-week follow-up. No difference was found between the treatment and placebo groups. Which of the following is the most likely explanation for the results of this study?
A. Observer effect
B. Berkson bias
C. Latency period (Correct Answer)
D. Confounding bias
E. Lead-time bias
Explanation: ***Latency period***
- A **latency period** refers to the time between exposure to a cause (e.g., treatment) and the manifestation of its effects (e.g., symptom improvement). The study's **3-week follow-up is too short** to observe the therapeutic benefits of thyroxine in subclinical hypothyroidism.
- Levothyroxine (T4) has a **half-life of approximately 7 days**, and it typically takes **6-8 weeks or longer** for steady-state levels to be achieved and for clinical symptoms to improve. The slow onset of action for thyroid hormone replacement and the gradual nature of symptom resolution mean a longer observation period (typically 3-6 months) is needed to assess efficacy in hypothyroidism.
- The null results likely reflect insufficient follow-up time rather than lack of treatment effect.
*Observer effect*
- The **observer effect**, or Hawthorne effect, occurs when subjects change their behavior because they know they are being observed. Both arms of this **double-blinded** trial were observed in the same way, so any such effect would apply equally to the treatment and placebo groups and cannot explain the absence of a difference between them.
- The primary issue here is the lack of observed therapeutic effect due to timing, not a change in behavior due to observation.
*Berkson bias*
- **Berkson bias** is a form of selection bias that arises in case-control studies conducted in hospitals, where the probability of being admitted to the hospital can be affected by both exposure and disease.
- This study is a **randomized controlled trial**, not a case-control study, and the selection of participants does not illustrate this specific bias.
*Confounding bias*
- **Confounding bias** occurs when an extraneous variable is associated with both the exposure and the outcome, distorting the observed relationship. The study states that **baseline patient characteristics were similarly distributed (p > 0.05)**, indicating successful randomization and minimization of confounding.
- While confounding is a common concern in observational studies, the RCT design and reported baseline similarities make it unlikely to be the primary explanation for the null results compared to an insufficient follow-up period.
*Lead-time bias*
- **Lead-time bias** is a form of detection bias where early detection of a disease through screening appears to prolong survival, even if the treatment does not change the course of the disease.
- This study is evaluating the **efficacy of treatment** in asymptomatic individuals with subclinical hypothyroidism, not the effect of screening on survival, making lead-time bias irrelevant to these results.
Question 2: In the study, all participants who were enrolled and randomly assigned to treatment with pulmharkimab were analyzed in the pulmharkimab group regardless of medication nonadherence or refusal of allocated treatment. A medical student reading the abstract is confused about why some participants assigned to pulmharkimab who did not adhere to the regimen were still analyzed as part of the pulmharkimab group. Which of the following best reflects the purpose of such an analysis strategy?
A. To minimize type 2 errors
B. To assess treatment efficacy more accurately
C. To reduce selection bias (Correct Answer)
D. To increase internal validity of study
E. To increase sample size
Explanation: ***To reduce selection bias***
- Analyzing participants in their originally assigned groups, regardless of adherence, is known as **intention-to-treat (ITT) analysis**.
- This method helps **preserve randomization** and minimizes **selection bias** that could arise if participants who did not adhere to treatment were excluded or re-assigned.
- **This is the most direct and specific purpose** of ITT analysis: preventing systematic differences between groups caused by post-randomization exclusions.
*To minimize type 2 errors*
- While ITT analysis affects statistical power, its primary purpose is not specifically to minimize **type 2 errors** (false negatives).
- ITT analysis may sometimes *increase* the likelihood of a type 2 error by diluting the treatment effect due to non-adherence.
*To assess treatment efficacy more accurately*
- ITT analysis assesses the **effectiveness** of *assigning* a treatment in a real-world setting, rather than the pure biological **efficacy** of the treatment itself.
- Efficacy is better assessed by a **per-protocol analysis**, which only includes compliant participants.
- ITT provides a more **conservative** and **pragmatic** estimate of treatment effect.
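The contrast between intention-to-treat and per-protocol analysis can be sketched with a toy dataset. This is only an illustration: the records below are invented, and the function names (`itt_rate`, `per_protocol_rate`) are hypothetical helpers, not part of any trial software.

```python
# Hypothetical trial records: (assigned_arm, adhered, improved).
# All values are invented for illustration only.
participants = [
    ("drug", True, True),
    ("drug", True, True),
    ("drug", True, False),
    ("drug", False, False),   # non-adherent, still counted in "drug" under ITT
    ("drug", False, False),   # non-adherent, still counted in "drug" under ITT
    ("placebo", True, True),
    ("placebo", True, False),
    ("placebo", True, False),
    ("placebo", True, False),
    ("placebo", True, False),
]


def response_rate(records):
    """Fraction of records with an improved outcome."""
    return sum(1 for _, _, improved in records if improved) / len(records)


def itt_rate(records, arm):
    """Intention-to-treat: everyone randomized to `arm`, adherent or not."""
    return response_rate([r for r in records if r[0] == arm])


def per_protocol_rate(records, arm):
    """Per-protocol: only participants in `arm` who adhered to the regimen."""
    return response_rate([r for r in records if r[0] == arm and r[1]])


itt_drug = itt_rate(participants, "drug")            # 2 of 5 improved
pp_drug = per_protocol_rate(participants, "drug")    # 2 of 3 improved
itt_placebo = itt_rate(participants, "placebo")      # 1 of 5 improved

print(f"ITT drug response:          {itt_drug:.2f}")
print(f"Per-protocol drug response: {pp_drug:.2f}")
print(f"ITT placebo response:       {itt_placebo:.2f}")
```

In this made-up example the per-protocol estimate is larger than the ITT estimate because the non-adherent participants, none of whom improved, are dropped from the drug arm. That is the sense in which ITT is the more conservative of the two.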
*To increase internal validity of study*
- While ITT analysis does contribute to **internal validity** by maintaining randomization, this is a **broader, secondary benefit** rather than the primary purpose.
- Internal validity encompasses many aspects of study design; ITT specifically addresses **post-randomization bias prevention**.
- The more precise answer is that ITT reduces **selection bias**, which is one specific threat to internal validity.
- Many other design features also contribute to internal validity (blinding, standardized protocols, etc.), making this option less specific.
*To increase sample size*
- ITT analysis includes all randomized participants, so it maintains the initial **sample size** that was randomized.
- However, the primary purpose is to preserve the integrity of randomization and prevent bias, not simply to increase the number of participants in the final analysis.
Question 3: A randomized controlled double-blind study is conducted on the efficacy of 2 sulfonylureas. The study concluded that medication 1 was more efficacious in lowering fasting blood glucose than medication 2 (p ≤ 0.05; 95% CI: 14 [10-21]). Which of the following is true regarding a 95% confidence interval (CI)?
A. If the same study were repeated multiple times, approximately 95% of the calculated confidence intervals would contain the true population parameter. (Correct Answer)
B. The 95% confidence interval is the probability chosen by the researcher to be the threshold of statistical significance.
C. When a 95% CI for the estimated difference between groups contains the value ‘0’, the results are significant.
D. It represents the probability that chance would not produce the difference shown, 95% of the time.
E. The study is adequately powered at the 95% confidence interval.
Explanation: ***If the same study were repeated multiple times, approximately 95% of the calculated confidence intervals would contain the true population parameter.***
- This statement accurately defines the **frequentist interpretation** of a confidence interval (CI). It reflects the long-run behavior of the CI over hypothetical repetitions of the study.
- A 95% CI means that if you were to repeat the experiment many times, 95% of the CIs calculated from those experiments would capture the **true underlying population parameter**.
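The repeated-sampling interpretation can be checked with a small simulation. The sketch below assumes a normally distributed outcome with a hypothetical true mean of 14 and standard deviation of 10, and uses the large-sample multiplier 1.96. None of these values come from the study itself.

```python
import random
import statistics


def ci_covers_true_mean(rng, true_mean, sd, n, z=1.96):
    """Draw one sample and report whether its 95% CI contains the true mean."""
    sample = [rng.gauss(true_mean, sd) for _ in range(n)]
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5  # standard error of the mean
    return mean - z * se <= true_mean <= mean + z * se


def coverage(trials=2000, true_mean=14.0, sd=10.0, n=100, seed=0):
    """Fraction of simulated studies whose CI captures the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(ci_covers_true_mean(rng, true_mean, sd, n) for _ in range(trials))
    return hits / trials


print(f"Observed coverage over repeated studies: {coverage():.3f}")
```

The observed fraction should land close to 0.95. Any single interval either contains the true mean or does not; the 95% describes the procedure over many repetitions.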
*The 95% confidence interval is the probability chosen by the researcher to be the threshold of statistical significance.*
- The **alpha level (α)**, typically set at 0.05 (or 5%), is the threshold for statistical significance (p ≤ 0.05), representing the probability of a Type I error.
- The 95% confidence level (1 − α) is the complement of α, not the significance *threshold* itself; it describes how often intervals constructed this way capture the true parameter over repeated sampling.
*When a 95% CI for the estimated difference between groups contains the value ‘0’, the results are significant.*
- If a 95% CI for the difference between groups **contains 0**, it implies that there is **no statistically significant difference** between the groups at the 0.05 alpha level.
- A statistically significant difference (p ≤ 0.05) would be indicated if the 95% CI **does NOT contain 0**, suggesting that the intervention had a real effect.
*It represents the probability that chance would not produce the difference shown, 95% of the time.*
- This statement misinterprets a CI as a probability statement about chance. The role of chance is addressed by the **p-value**, which is the probability of observing a difference at least as large as the one found if the null hypothesis were true.
- A CI provides a **range of plausible values** for the population parameter, not a probability about the role of chance in producing the observed difference.
*The study is adequately powered at the 95% confidence interval.*
- **Statistical power** is the probability of correctly rejecting a false null hypothesis, typically set at 80% or 90%. It is primarily determined by sample size, effect size, and alpha level.
- A 95% CI is a measure of the **precision** of an estimate, while power refers to the **ability of a study to detect an effect** if one exists. They are related but distinct concepts.
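To make the distinction concrete, power can be approximated from sample size, effect size, and alpha. The sketch below uses a normal approximation for a two-sided comparison of two equal-sized groups with a known common standard deviation. The function name `approx_power` and the input values are assumptions for illustration only.

```python
from statistics import NormalDist


def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    Assumes equal group sizes and a known common standard deviation,
    and ignores the negligible contribution of the opposite tail.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    se = sd * (2 / n_per_group) ** 0.5             # SE of the difference in means
    return NormalDist().cdf(abs(effect) / se - z_crit)


# Hypothetical inputs: a difference of half a standard deviation.
for n in (32, 64, 128):
    print(f"n per group = {n:3d}: power ~ {approx_power(0.5, 1.0, n):.2f}")
```

With these assumed inputs, about 64 participants per group gives roughly 80% power, and power rises with sample size. The confidence level stays at 95% throughout, which is why a CI on its own says nothing about whether a study was adequately powered.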
Question 4: A randomized controlled trial is conducted investigating the effects of different diagnostic imaging modalities on breast cancer mortality. 8,000 women are randomized to receive either conventional mammography or conventional mammography with breast MRI. The primary outcome is survival from the time of breast cancer diagnosis. The conventional mammography group has a median survival after diagnosis of 17.0 years. The MRI plus conventional mammography group has a median survival of 19.5 years. If this difference is statistically significant, which form of bias may be affecting the results?
A. Recall bias
B. Selection bias
C. Misclassification bias
D. Because this study is a randomized controlled trial, it is free of bias
E. Lead-time bias (Correct Answer)
Explanation: ***Lead-time bias***
- This bias occurs when a screening test diagnoses a disease earlier, making **survival appear longer** even if the actual time of death is unchanged.
- In this scenario, adding **MRI** may detect breast cancer at an earlier, asymptomatic stage, artificially extending the apparent survival duration from diagnosis without necessarily changing the ultimate prognosis.
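The arithmetic behind lead-time bias can be shown with one hypothetical patient. The ages below are invented; they are chosen only so that the survival figures match the 17.0 and 19.5 years in the question while the age at death stays fixed.

```python
def survival_from_diagnosis(age_at_diagnosis, age_at_death):
    """Survival measured from diagnosis, the trial's primary outcome."""
    return age_at_death - age_at_diagnosis


# Hypothetical patient: the disease course and age at death are identical
# in both scenarios; only the age at detection differs.
AGE_AT_DEATH = 77.0

mammography_only = survival_from_diagnosis(60.0, AGE_AT_DEATH)
mammography_plus_mri = survival_from_diagnosis(57.5, AGE_AT_DEATH)
lead_time = mammography_plus_mri - mammography_only

print(f"Mammography only:      {mammography_only} years")      # 17.0
print(f"Mammography plus MRI:  {mammography_plus_mri} years")  # 19.5
print(f"Apparent gain (lead time): {lead_time} years")         # 2.5
```

The apparent 2.5-year gain comes entirely from moving the starting point of the clock earlier. An outcome such as mortality measured from the time of randomization would not be affected in this way.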
*Recall bias*
- **Recall bias** applies to retrospective studies where subjects are asked to recall past exposures, and those with the outcome are more likely to remember potential exposures.
- It's irrelevant here as this is a **prospective randomized controlled trial** studying objective survival outcomes, not subjective past recollections.
*Selection bias*
- **Selection bias** occurs when the way participants are selected for, or allocated to, study groups produces systematic differences between the groups that influence the outcome.
- This study is a **randomized controlled trial**, which is designed to minimize selection bias by ensuring participants have an equal chance of being assigned to either treatment arm.
*Misclassification bias*
- **Misclassification bias** happens when either the exposure or the outcome is incorrectly categorized, leading to erroneous associations.
- This study uses objective diagnostic imaging and survival data, thus reducing the likelihood of **misclassification of diagnosis or survival status**.
*Because this study is a randomized controlled trial, it is free of bias*
- While **randomized controlled trials (RCTs)** are considered the **gold standard** for minimizing bias, they are not entirely immune to all forms of bias.
- **Lead-time bias**, for instance, can still occur in RCTs involving screening or early diagnosis, as seen in this example, and other biases like **information bias** or **reporting bias** can also arise.
Question 5: A group of bariatric surgeons is investigating a novel surgically-placed tube that drains a portion of the stomach following each meal. They are interested in studying its efficacy in facilitating weight loss in obese adults with BMIs > 40 kg/m² who have failed to lose weight through non-surgical options. After randomizing 150 patients to undergo the surgical tube procedure and 150 patients to non-surgical weight loss options (e.g., diet, exercise), the surgeons found that, on average, participants in the surgical treatment group lost 15% of their total body weight in comparison to 4% in the non-surgical group. Which of the following statistical tests is an appropriate initial test to evaluate if this difference in weight loss between the two groups is statistically significant?
A. Kaplan-Meier analysis
B. Paired two-sample t-test
C. Multiple linear regression
D. Pearson correlation coefficient
E. Unpaired two-sample t-test (Correct Answer)
Explanation: ***Unpaired two-sample t-test***
- The goal is to compare the **means of two independent groups** (surgical vs. non-surgical) on a continuous outcome (percentage of weight loss).
- An unpaired t-test is ideal for determining if the **observed difference between these two group means** is statistically significant.
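As a rough sketch of the calculation, the pooled-variance (Student's) form of the unpaired t statistic can be computed by hand. The five values per group below are invented so that the group means match the 15% and 4% in the question; the real trial had 150 patients per arm, and the equal-variance assumption is ours.

```python
import statistics


def unpaired_t_statistic(group_a, group_b):
    """Student's two-sample t statistic with pooled variance.

    Assumes independent groups with equal variances.
    Returns (t, degrees_of_freedom).
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.fmean(group_a), statistics.fmean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5  # SE of the mean difference
    return (mean_a - mean_b) / se, df


# Hypothetical percent total body weight lost (means are 15 and 4).
surgical = [12.0, 18.0, 15.0, 14.0, 16.0]
non_surgical = [3.0, 5.0, 4.0, 2.0, 6.0]

t_stat, df = unpaired_t_statistic(surgical, non_surgical)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
# The two-sided 5% critical value for 8 degrees of freedom is 2.306,
# so a t statistic above that would be called significant.
```

A paired test would instead work on within-person differences, which do not exist here because each patient belongs to only one arm.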
*Kaplan-Meier analysis*
- This analysis is used to estimate and compare **survival curves** or time-to-event data between groups.
- It is not suitable for comparing the **mean weight loss** between two independent groups.
*Paired two-sample t-test*
- A paired t-test is used when comparing two measurements from the **same individuals** or **matched pairs**.
- Here, the two groups are distinct and independent, not paired in any way.
*Multiple linear regression*
- This is used to model the **relationship between a dependent variable** and **two or more independent variables**.
- While useful for predicting weight loss based on multiple factors, it's not the most direct or initial test for simply comparing the mean weight loss between two groups.
*Pearson correlation coefficient*
- The Pearson correlation coefficient measures the **strength and direction of a linear relationship between two continuous variables**.
- It does not compare the means of two independent groups, but rather assesses the **degree to which two variables change together**.