Sample size for non-inferiority trials US Medical PG Practice Questions and MCQs
Practice US Medical PG questions for Sample size for non-inferiority trials. These multiple choice questions (MCQs) cover important concepts and help you prepare for your exams.
Sample size for non-inferiority trials US Medical PG Question 1: A study is funded by the tobacco industry to examine the association between smoking and lung cancer. They design a study with a prospective cohort of 1,000 smokers between the ages of 20-30. The length of the study is five years. After the study period ends, they conclude that there is no relationship between smoking and lung cancer. Which of the following study features is the most likely reason for the failure of the study to note an association between tobacco use and cancer?
- A. Late-look bias
- B. Latency period (Correct Answer)
- C. Confounding
- D. Effect modification
- E. Pygmalion effect
Sample size for non-inferiority trials Explanation: ***Latency period***
- **Lung cancer** typically has a **long latency period**, often **20-30+ years**, between initial exposure to tobacco carcinogens and the development of clinically detectable disease.
- A **five-year study duration** in young smokers (ages 20-30) is **far too short** to observe the development of lung cancer, which explains the false negative finding.
- This represents a **fundamental flaw in study design** rather than a bias—the biological timeline of disease development was not adequately considered.
*Late-look bias*
- **Late-look bias** occurs when a study enrolls participants who have already survived the early high-risk period of a disease, leading to **underestimation of true mortality or incidence**.
- Also called **survival bias**, it involves studying a population that has already been "selected" by survival.
- This is not applicable here, as the study simply ended before sufficient time elapsed for disease to develop.
*Confounding*
- **Confounding** occurs when a third variable is associated with both the exposure and outcome, distorting the apparent relationship between them.
- While confounding can affect study results, it would not completely eliminate the detection of a strong, well-established association like smoking and lung cancer in a properly conducted prospective cohort study.
- The issue here is temporal (insufficient follow-up time), not the presence of an unmeasured confounder.
*Effect modification*
- **Effect modification** (also called interaction) occurs when the magnitude of an association between exposure and outcome differs across levels of a third variable.
- This represents a **true biological phenomenon**, not a study design flaw or bias.
- It would not explain the complete failure to detect any association.
*Pygmalion effect*
- The **Pygmalion effect** (observer-expectancy effect) refers to a psychological phenomenon where higher expectations lead to improved performance in the observed subjects.
- This concept is relevant to **behavioral and educational research**, not to objective epidemiological studies of disease incidence.
- It has no relevance to the biological relationship between carcinogen exposure and cancer development.
Sample size for non-inferiority trials US Medical PG Question 2: Researchers are studying the effects of a new medication for the treatment of type 2 diabetes. A randomized group of 100 subjects is given the new medication 1st for 2 months, followed by a washout period of 2 weeks, and then administration of the gold standard medication for 2 months. Another randomized group of 100 subjects is given the gold standard medication 1st for 2 months, followed by a washout period of 2 weeks, and then administration of the new medication for 2 months. What is the main disadvantage of this study design?
- A. Hawthorne effect
- B. Increasing selection bias
- C. Increasing confounding bias
- D. Decreasing power
- E. Carryover effect (Correct Answer)
Sample size for non-inferiority trials Explanation: ***Carryover effect***
- The primary disadvantage here is the **carryover effect**, where the effects of the first treatment (new medication or gold standard) may persist into the period when the second treatment is administered, even after a washout period.
- This can **mask or alter the true effect** of the second treatment, making it difficult to accurately assess their individual efficacy.
*Hawthorne effect*
- The **Hawthorne effect** refers to subjects improving their behavior or performance in response to being observed or studied, not specifically an issue with sequential treatment administration.
- It would affect both groups equally and doesn't explain a disadvantage inherent to the crossover design itself.
*Increasing selection bias*
- **Selection bias** arises when the way participants are chosen or allocated produces groups that are not comparable. This study **randomizes** participants into the two treatment sequences, and in a crossover each participant serves as their own control, which *reduces* rather than increases selection bias.
- The sequential administration within a randomized crossover design actually helps to mitigate selection bias between treatment arms.
*Increasing confounding bias*
- **Confounding bias** occurs when an unmeasured variable is associated with both the exposure and the outcome, distorting the observed relationship.
- This crossover design, where each participant receives both treatments, is intended to *reduce* confounding by inter-individual variability, as each subject acts as their own control, rather than increasing it.
*Decreasing power*
- **Power** is the ability of a study to detect a true effect if one exists. Crossover designs often *increase* statistical power compared to parallel designs because each participant receives both treatments, reducing inter-individual variability.
- This design typically requires a smaller sample size to achieve the same power as a parallel group study, so decreased power is not a disadvantage.
Sample size for non-inferiority trials US Medical PG Question 3: A research team develops a new monoclonal antibody checkpoint inhibitor for advanced melanoma that has shown promise in animal studies as well as high efficacy and low toxicity in early phase human clinical trials. The research team would now like to compare this drug to existing standard of care immunotherapy for advanced melanoma. The research team decides to conduct a non-randomized study where the novel drug will be offered to patients who are deemed to be at risk for toxicity with the current standard of care immunotherapy, while patients without such risk factors will receive the standard treatment. Which of the following best describes the level of evidence that this study can offer?
- A. Level 1
- B. Level 3 (Correct Answer)
- C. Level 5
- D. Level 4
- E. Level 2
Sample size for non-inferiority trials Explanation: ***Level 3***
- A **non-randomized controlled trial** like the one described, where patient assignment to treatment groups is based on specific characteristics (risk of toxicity), falls into Level 3 evidence.
- This level typically includes **non-randomized controlled trials** and **well-designed cohort studies** with comparison groups, which are prone to selection bias and confounding.
- The study compares two treatments but lacks randomization, making it Level 3 evidence.
*Level 1*
- Level 1 evidence is the **highest level of evidence**, derived from **systematic reviews and meta-analyses** of multiple well-designed randomized controlled trials or large, high-quality randomized controlled trials.
- The described study is explicitly stated as non-randomized, ruling out Level 1.
*Level 2*
- Level 2 evidence involves at least one **well-designed randomized controlled trial** (RCT) or **systematic reviews** of randomized trials.
- The current study is *non-randomized*, which means it cannot be classified as Level 2 evidence, as randomization is a key criterion for this level.
*Level 4*
- Level 4 evidence includes **case series**, **case-control studies**, and **poorly designed cohort or case-control studies**.
- While the study is non-randomized, it is a controlled comparative trial rather than a case series or retrospective case-control study, placing it at Level 3.
*Level 5*
- Level 5 evidence is the **lowest level of evidence**, typically consisting of **expert opinion** without explicit critical appraisal, or based on physiology, bench research, or animal studies.
- While the drug was initially tested in animal studies, the current human comparative study offers a higher level of evidence than expert opinion or preclinical data.
Sample size for non-inferiority trials US Medical PG Question 4: A study is being conducted on depression using the Patient Health questionnaire (PHQ-9) survey data embedded within a popular social media network with a response size of 500,000 participants. The sample population of this study is approximately normal. The mean PHQ-9 score is 14, and the standard deviation is 4. How many participants have scores greater than 22?
- A. 175,000
- B. 17,500
- C. 160,000
- D. 12,500 (Correct Answer)
- E. 25,000
Sample size for non-inferiority trials Explanation: ***12,500***
- To find the number of participants with scores greater than 22, first calculate the **z-score** for a score of 22: $Z = \frac{(X - \mu)}{\sigma} = \frac{(22 - 14)}{4} = 2$.
- A z-score of 2 means the score is **2 standard deviations above the mean**. Using the **empirical rule** for a normal distribution, approximately **2.5%** of the data falls beyond 2 standard deviations above the mean (5% total in both tails, so 2.5% in each tail).
- Therefore, $2.5\%$ of the total 500,000 participants is $0.025 \times 500,000 = 12,500$.
*175,000*
- This option would imply a much larger proportion of the population scoring above 22, inconsistent with the **normal distribution's properties** and the calculated z-score.
- It would correspond to a z-score closer to 0, indicating a score closer to the mean, not two standard deviations above it.
*17,500*
- This value represents **3.5%** of the total population ($17,500 / 500,000 = 0.035$).
- An upper-tail proportion of 3.5% corresponds to a z-score of about 1.8, not 2, indicating an incorrect calculation or a misreading of the **normal distribution table**.
*160,000*
- This option represents a very large portion of the participants, roughly **32%** of the total population.
- About 32% of a normal distribution lies more than one standard deviation from the mean (both tails combined), so this figure does not describe scores more than 2 standard deviations above the mean.
*25,000*
- This value represents **5%** of the total population ($25,000 / 500,000 = 0.05$).
- A z-score greater than 2 corresponds to the far tail of the normal distribution, where only 2.5% of the data lies, not 5%. This would correspond to a z-score of approximately 1.65.
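The tail calculation in this explanation can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the function name `count_above` is illustrative and not part of the question. It compares the empirical-rule shortcut (2.5% in the upper tail) with the exact normal tail area, which is slightly smaller (about 2.28%), so the empirical rule is an approximation that matches the intended answer choice.

```python
from statistics import NormalDist

def count_above(threshold, mean, sd, n):
    """Return the z-score and the expected count above `threshold`
    in a normally distributed population of size n."""
    z = (threshold - mean) / sd
    upper_tail = 1.0 - NormalDist().cdf(z)  # area to the right of z
    return z, upper_tail * n

z, exact_count = count_above(22, 14, 4, 500_000)
empirical_count = 0.025 * 500_000  # empirical-rule shortcut used in the explanation

print(f"z = {z:.1f}")
print(f"empirical rule: {empirical_count:,.0f}")
print(f"exact normal tail: {exact_count:,.0f}")
```

Exam questions of this type generally expect the empirical-rule figure; the exact tail area is shown only to make the size of the approximation visible.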
Sample size for non-inferiority trials US Medical PG Question 5: An academic medical center in the United States is approached by a pharmaceutical company to run a small clinical trial to test the effectiveness of its new drug, compound X. The company wants to know if the measured hemoglobin A1c (HbA1c) of patients with type 2 diabetes receiving metformin and compound X would be lower than that of control subjects receiving only metformin. After a year of study and data analysis, researchers conclude that the control and treatment groups did not differ significantly in their HbA1c levels.
However, parallel clinical trials in several other countries found that compound X led to a significant decrease in HbA1c. Interested in the discrepancy between these findings, the company funded a larger study in the United States, which confirmed that compound X decreased HbA1c levels. After compound X was approved by the FDA, and after several years of use in the general population, outcomes data confirmed that it effectively lowered HbA1c levels and increased overall survival. What term best describes the discrepant findings in the initial clinical trial run by the academic medical center?
- A. Type I error
- B. Hawthorne effect
- C. Type II error (Correct Answer)
- D. Publication bias
- E. Confirmation bias
Sample size for non-inferiority trials Explanation: ***Type II error***
- A **Type II error** occurs when a study fails to **reject a false null hypothesis**, meaning it concludes there is no significant difference or effect when one actually exists.
- In this case, the initial US trial incorrectly concluded that Compound X had no significant effect on HbA1c, while subsequent larger studies and real-world data proved it did.
*Type I error*
- A **Type I error** (alpha error) occurs when a study incorrectly **rejects a true null hypothesis**, concluding there is a significant difference or effect when there isn't.
- This scenario describes the opposite: the initial study failed to find an effect that genuinely existed, indicating a Type II error, not a Type I error.
*Hawthorne effect*
- The **Hawthorne effect** is a type of reactivity in which individuals modify an aspect of their behavior in response to their awareness of being observed.
- This effect does not explain the initial trial's failure to detect a real drug effect; rather, it relates to participants changing behavior due to study participation itself.
*Publication bias*
- **Publication bias** occurs when studies with positive or statistically significant results are more likely to be published than those with negative or non-significant results.
- While relevant to the literature as a whole, it doesn't explain the discrepancy in findings within a single drug's development where a real effect was initially missed.
*Confirmation bias*
- **Confirmation bias** is the tendency to search for, interpret, favor, and recall information in a way that confirms one's preexisting beliefs or hypotheses.
- This bias would likely lead researchers to *find* an effect if they expected one, or to disregard data that contradicts their beliefs, which is not what happened in the initial trial.
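To make the link between a small trial and a Type II error concrete, here is a minimal sketch of approximate power for a two-sided z-test comparing two means. The effect size, standard deviation, and per-arm sample sizes are hypothetical values chosen for illustration; none of them come from the question.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_power(delta, sd, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two means."""
    norm = NormalDist()
    se = sd * sqrt(2.0 / n_per_arm)           # standard error of the difference
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    # Probability of rejecting in the direction of the true effect;
    # the negligible opposite-tail term is ignored.
    return norm.cdf(abs(delta) / se - z_crit)

# Hypothetical: true HbA1c difference of 0.3 with SD 1.0 (percentage points).
small = two_sample_power(delta=0.3, sd=1.0, n_per_arm=50)
large = two_sample_power(delta=0.3, sd=1.0, n_per_arm=400)

print(f"power with 50 per arm:  {small:.2f}")
print(f"power with 400 per arm: {large:.2f}")
```

Under these assumed numbers the small trial would miss a real effect most of the time (β well above 0.5), while the larger trial would almost always detect it.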
Sample size for non-inferiority trials US Medical PG Question 6: You are currently employed as a clinical researcher working on clinical trials of a new drug to be used for the treatment of Parkinson's disease. Currently, you have already determined the safe clinical dose of the drug in a healthy patient. You are in the phase of drug development where the drug is studied in patients with the target disease to determine its efficacy. Which of the following phases is this new drug currently in?
- A. Phase 4
- B. Phase 1
- C. Phase 2 (Correct Answer)
- D. Phase 0
- E. Phase 3
Sample size for non-inferiority trials Explanation: ***Phase 2***
- **Phase 2 trials** involve studying the drug in patients with the target disease to assess its **efficacy** and further evaluate safety, typically involving a few hundred patients.
- The question describes a stage after safe dosing in healthy patients (Phase 1) and before large-scale efficacy confirmation (Phase 3), focusing on efficacy in the target population.
*Phase 4*
- **Phase 4 trials** occur **after a drug has been approved** and marketed, monitoring long-term effects, optimal use, and rare side effects in a diverse patient population.
- This phase is conducted post-market approval, whereas the question describes a drug still in development prior to approval.
*Phase 1*
- **Phase 1 trials** primarily focus on determining the **safety and dosage** of a new drug in a **small group of healthy volunteers** (or sometimes patients with advanced disease if the drug is highly toxic).
- The question states that the safe clinical dose in a healthy patient has already been determined, indicating that Phase 1 has been completed.
*Phase 0*
- **Phase 0 trials** are exploratory, very early-stage studies designed to confirm that the drug reaches the target and acts as intended, typically involving a very small number of doses and participants.
- These trials are conducted much earlier in the development process, preceding the determination of safe clinical doses and large-scale efficacy studies.
*Phase 3*
- **Phase 3 trials** are large-scale studies involving hundreds to thousands of patients to confirm **efficacy**, monitor side effects, compare it to commonly used treatments, and collect information that will allow the drug to be used safely.
- While Phase 3 does assess efficacy, it follows Phase 2 and is typically conducted on a much larger scale before submitting for regulatory approval.
Sample size for non-inferiority trials US Medical PG Question 7: The height of American adults is expected to follow a normal distribution, with a typical male adult having an average height of 69 inches with a standard deviation of 0.1 inches. An investigator has been informed about a community in the American Midwest with a history of heavy air and water pollution in which a lower mean height has been reported. The investigator plans to sample 30 male residents to test the claim that heights in this town differ significantly from the national average based on heights assumed to be normally distributed. The significance level is set at 10% and the probability of a type 2 error is assumed to be 15%. Based on this information, which of the following is the power of the proposed study?
- A. 0.10
- B. 0.85 (Correct Answer)
- C. 0.90
- D. 0.15
- E. 0.05
Sample size for non-inferiority trials Explanation: ***0.85***
- **Power** is defined as **1 - β**, where β is the **probability of a Type II error**.
- Given that the probability of a **Type II error (β)** is 15% or 0.15, the power of the study is 1 - 0.15 = **0.85**.
*0.10*
- This value represents the **significance level (α)**, which is the probability of committing a **Type I error** (rejecting a true null hypothesis).
- The significance level is distinct from the **power of the study**, which relates to Type II errors.
*0.90*
- This value would be the power if the **Type II error rate (β)** was 0.10 (1 - 0.10 = 0.90), but the question specifies a β of 0.15.
- It is also the complement of the significance level (1 - α), which is not the definition of power.
*0.15*
- This value is the **probability of a Type II error (β)**, not the power of the study.
- **Power** is the probability of correctly rejecting a false null hypothesis, which is 1 - β.
*0.05*
- While 0.05 is a common significance level (α), it is not given as the significance level in this question (which is 0.10).
- This value also does not represent the power of the study, which would be calculated using the **Type II error rate**.
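The arithmetic for this question is a one-liner. The short sketch below lists `alpha` only to show that it does not enter the calculation.

```python
alpha = 0.10  # significance level given in the question (not used for power)
beta = 0.15   # probability of a Type II error given in the question

power = 1 - beta  # power is defined as 1 - beta
print(f"power = {power:.2f}")
```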
Sample size for non-inferiority trials US Medical PG Question 8: The mean, median, and mode weight of 37 newborns in a hospital nursery is 7 lbs 2 oz. In fact, there are 7 infants in the nursery that weigh exactly 7 lbs 2 oz. The standard deviation of the weights is 2 oz. The weights follow a normal distribution. A newborn delivered at 10 lbs 2 oz is added to the data set. What is most likely to happen to the mean, median, and mode with the addition of this new data point?
- A. The mean will increase; the median will increase; the mode will stay the same
- B. The mean will increase; the median will stay the same; the mode will stay the same (Correct Answer)
- C. The mean will stay the same; the median will increase; the mode will stay the same
- D. The mean will increase; the median will increase; the mode will increase
- E. The mean will stay the same; the median will increase; the mode will increase
Sample size for non-inferiority trials Explanation: ***The mean will increase; the median will stay the same; the mode will stay the same***
- The **mean** is highly sensitive to outliers. Adding a newborn weighing 10 lbs 2 oz (significantly heavier than the original mean of 7 lbs 2 oz) will increase the total sum of weights, thus **increasing the mean**.
- The **median** is the middle value in an ordered dataset. With 37 newborns, the median is the 19th value. Adding one more (38 total) makes the median the average of the 19th and 20th values. Since the new value (10 lbs 2 oz) is added at the extreme high end of the distribution, the 19th and 20th positions contain the same values as before. Therefore, the median will **stay the same**.
- The **mode** is the most frequent value. Since there are 7 infants already at 7 lbs 2 oz, adding a single infant at 10 lbs 2 oz will not change the most frequent weight in the dataset. The mode will **stay the same** at 7 lbs 2 oz.
*The mean will increase; the median will increase; the mode will stay the same*
- While the **mean will increase** due to the added outlier, the **median will not change**. With 38 observations, the median becomes the average of the 19th and 20th values, which remain unchanged since the outlier is added at position 38.
- The **mode** correctly stays at 7 lbs 2 oz as the new data point does not become the most frequent value.
*The mean will stay the same; the median will increase; the mode will stay the same*
- The **mean will not stay the same** because an outlier significantly higher than the current mean will always pull the mean higher.
- The **median will also not increase** as the middle values (19th and 20th positions) remain unchanged when adding an extreme outlier.
*The mean will increase; the median will increase; the mode will increase*
- While the **mean will increase**, the **median will not change** because the middle positions are unaffected by adding one extreme outlier.
- The **mode will not change** as the new data point (10 lbs 2 oz) is unique and doesn't become the most frequent value; 7 lbs 2 oz remains most frequent with 7 occurrences.
*The mean will stay the same; the median will increase; the mode will increase*
- This option is incorrect because the **mean will definitely increase** with the addition of a much larger value.
- The **median will not increase** as it depends on the middle positions, not extreme values.
- The **mode will not increase** as adding one 10 lb 2 oz infant won't make that weight the most frequent.
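The behavior of the three measures can be illustrated with a small simulation. The data set below is hypothetical: it is built to have 37 weights with mean, median, and mode all equal to 7 lbs 2 oz (114 oz) and 7 infants at exactly that weight, but its spread is not tuned to match the 2 oz standard deviation in the question.

```python
import statistics

TARGET = 114  # 7 lbs 2 oz expressed in ounces

# Hypothetical symmetric nursery: 15 weights below, 7 at the target, 15 above.
offsets = [1] * 5 + [2] * 4 + [3] * 3 + [4] * 2 + [5]
before = (
    [TARGET - d for d in offsets]
    + [TARGET] * 7
    + [TARGET + d for d in offsets]
)
after = before + [162]  # add the 10 lbs 2 oz newborn (162 oz)

for label, data in (("before", before), ("after", after)):
    print(
        label,
        "n =", len(data),
        "mean =", round(statistics.mean(data), 2),
        "median =", statistics.median(data),
        "mode =", statistics.mode(data),
    )
```

In this constructed example only the mean moves after the outlier is added; the median and mode remain at 114 oz.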
Sample size for non-inferiority trials US Medical PG Question 9: A health system implements a new sepsis protocol across 20 hospitals. A researcher plans to evaluate effectiveness using a stepped-wedge cluster randomized design where hospitals sequentially adopt the protocol every 3 months. She calculates sample size based on individual patient outcomes (mortality) needing 2,000 patients total. The biostatistician identifies a critical error. Evaluate what modification is needed.
- A. Adjust for multiple time periods using Bonferroni correction
- B. Use hospital-level outcomes instead of patient-level outcomes as unit of analysis
- C. Increase alpha to 0.10 to account for cluster randomization reducing power
- D. Include random effects for both hospital and time period in power calculation
- E. Account for intra-cluster correlation coefficient (ICC) requiring substantial sample size inflation (Correct Answer)
Sample size for non-inferiority trials Explanation: ***Account for intra-cluster correlation coefficient (ICC) requiring substantial sample size inflation***
- In cluster-randomized designs, observations within the same cluster (hospital) are not independent; the **Intra-cluster Correlation Coefficient (ICC)** quantifies this correlation and must be used to calculate a **design effect**.
- Neglecting the ICC leads to an **underpowered study** because the effective sample size is smaller than the total number of individual patients measured.
*Adjust for multiple time periods using Bonferroni correction*
- **Bonferroni correction** is used to control for **Type I error** when performing multiple independent hypothesis tests, not for determining sample size in nested longitudinal designs.
- While the stepped-wedge design involves multiple time points, the primary analysis typically uses a **single model** (e.g., GEE or GLMM) that accounts for time as a fixed effect.
*Use hospital-level outcomes instead of patient-level outcomes as unit of analysis*
- While the hospital is the **unit of randomization**, using hospital-level means as the unit of analysis simplifies the data and causes a significant loss of **statistical information** and precision.
- Modern biostatistical methods utilize **multilevel modeling** to maintain the richness of patient-level data while adjusting for the cluster-level randomization.
*Include random effects for both hospital and time period in power calculation*
- While random effects are important for the **analysis phase**, the "critical error" identified in the prompt refers to the initial failure to inflate the sample size based on **clustering (ICC)**.
- Power calculations for stepped-wedge designs are complex and certainly involve time parameters, but **ICC-based inflation** is the most fundamental adjustment required when moving from individual to cluster randomization.
*Increase alpha to 0.10 to account for cluster randomization reducing power*
- Increasing the **alpha level** (significance threshold) is not a standard or scientifically acceptable method to compensate for the loss of power due to **clustering**.
- Standard practice mandates maintaining an **alpha of 0.05** while appropriately increasing the **sample size** or number of clusters to reach the desired power (usually 80-90%).
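The size of the inflation can be sketched with the simple design-effect formula for equal-sized clusters, $1 + (m - 1) \times \text{ICC}$. The ICC value below is a hypothetical placeholder, since the question does not give one, and this formula applies to a parallel cluster-randomized design; a full stepped-wedge calculation would also account for the number of steps and time periods, so treat this as a rough illustration only.

```python
def design_effect(cluster_size, icc):
    """Simple design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc

n_individual = 2000                        # from the question
n_clusters = 20                            # from the question
cluster_size = n_individual // n_clusters  # 100 patients per hospital
icc = 0.05                                 # hypothetical, not given in the question

deff = design_effect(cluster_size, icc)
inflated = n_individual * deff

print(f"design effect: {deff:.2f}")
print(f"inflated sample size: about {inflated:,.0f} patients")
```

Even a modest assumed ICC multiplies the required enrollment several times over when clusters are large, which is why ignoring clustering leaves the study underpowered.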
Sample size for non-inferiority trials US Medical PG Question 10: A 41-year-old research fellow designs a non-inferiority trial comparing oral to IV antibiotics for osteomyelitis. She sets the non-inferiority margin at 10% (cure rate difference), expects 85% cure in both groups, and calculates 300 patients per arm for 80% power with α=0.025 (one-sided). Her mentor suggests this underestimates required sample size. Evaluate the mentor's concern.
- A. Correct; non-inferiority trials require larger samples than superiority trials for equivalent power (Correct Answer)
- B. Incorrect; non-inferiority trials actually require smaller samples due to less stringent hypotheses
- C. Correct; dropout rates in antibiotic trials necessitate 20% inflation of calculated sample size
- D. Incorrect; the calculation appropriately uses one-sided alpha for non-inferiority testing
- E. Correct; the margin should be set at 5% requiring doubling of sample size
Sample size for non-inferiority trials Explanation: ***Correct; non-inferiority trials require larger samples than superiority trials for equivalent power***
- **Non-inferiority trials** are designed to exclude a difference greater than a pre-specified margin, which typically requires a **larger sample size** than superiority trials investigating the same outcome.
- Because we are proving that the new treatment is "not much worse" (rather than "better"), the **statistical threshold** often necessitates higher enrollment to achieve adequate **power**.
*Incorrect; the calculation appropriately uses one-sided alpha for non-inferiority testing*
- While it is true that **non-inferiority testing** uses a **one-sided alpha (0.025)**, this does not negate the fact that such trials inherently require more participants.
- The mentor's concern is about the **total N**, which remains insufficient despite using the correct one-sided alpha convention.
*Correct; the margin should be set at 5% requiring doubling of sample size*
- There is no universal rule that the **non-inferiority margin** must be 5%; it is determined by **clinical significance** and regulatory standards for the specific condition.
- While a 5% margin would indeed increase the sample size, the 10% margin is often standard in **antibiotic trials** for osteomyelitis.
*Incorrect; non-inferiority trials actually require smaller samples due to less stringent hypotheses*
- This is a common misconception; non-inferiority trials are actually more demanding because the **null hypothesis** is that the new treatment is inferior by at least the margin, and the trial must gather enough evidence to reject it.
- Disproving **inferiority** within a tight **margin (delta)** is statistically more intensive than proving a treatment is superior to a placebo.
*Correct; dropout rates in antibiotic trials necessitate 20% inflation of calculated sample size*
- While **attrition bias** is a concern, there is no fixed rule that every trial needs a **20% inflation** factor.
- The mentor's concern is specifically about the **base calculation** and the statistical nature of non-inferiority designs rather than just the **dropout rate**.
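For readers who want to check the arithmetic, the quantities in the question can be plugged into a common normal-approximation formula for a non-inferiority comparison of two proportions. This is a minimal sketch of one textbook method, not necessarily the calculation the question intends; the function name and parameter names are illustrative, and it makes no allowance for dropout or continuity correction.

```python
from math import ceil
from statistics import NormalDist

def ni_sample_size_per_arm(p_control, p_new, margin,
                           alpha_one_sided=0.025, power=0.80):
    """Normal-approximation sample size per arm for a
    non-inferiority test on the difference of two proportions."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1.0 - alpha_one_sided)
    z_beta = norm.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_new * (1 - p_new)
    # Distance between the assumed true difference and the margin.
    effect = margin - (p_control - p_new)
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

# Values from the question: 85% cure in both arms, 10% margin.
n = ni_sample_size_per_arm(p_control=0.85, p_new=0.85, margin=0.10)
# Same assumptions with a tighter 5% margin, for comparison.
n_tight = ni_sample_size_per_arm(p_control=0.85, p_new=0.85, margin=0.05)

print(f"per arm, 10% margin: {n}")
print(f"per arm, 5% margin:  {n_tight}")
```

With identical assumed cure rates this approximation gives roughly 200 per arm for the 10% margin, and halving the margin roughly quadruples the requirement. The result is sensitive to the assumed true difference: if the new treatment is assumed even slightly worse than the comparator, the required number rises quickly, and any dropout allowance is added on top.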