Relationship between CIs and hypothesis testing US Medical PG Practice Questions and MCQs
Practice US Medical PG questions for Relationship between CIs and hypothesis testing. These multiple choice questions (MCQs) cover important concepts and help you prepare for your exams.
Relationship between CIs and hypothesis testing US Medical PG Question 1: A group of 100 medical students took an end-of-year exam. The mean score on the exam was 70%, with a standard deviation of 25%. The professor states that a student's score must be within the 95% confidence interval of the mean to pass the exam. Which of the following is the minimum score a student can have to pass the exam?
- A. 45%
- B. 63.75%
- C. 67.5%
- D. 20%
- E. 65% (Correct Answer)
Relationship between CIs and hypothesis testing Explanation: ***65%***
- To find the **95% confidence interval (CI) of the mean**, we use the formula: Mean ± (Z-score × Standard Error). For a 95% CI, the Z-score is approximately **1.96**.
- The **Standard Error (SE)** is calculated as SD/√n, where n is the sample size (100 students). So, SE = 25%/√100 = 25%/10 = **2.5%**.
- The 95% CI is 70% ± (1.96 × 2.5%) = 70% ± 4.9%. The lower bound is 70% - 4.9% = **65.1%**, which rounds to **65%** as the minimum passing score.
*45%*
- This value is significantly lower than the calculated lower bound of the 95% confidence interval (approximately 65.1%).
- It would represent a score far outside the defined passing range.
*63.75%*
- This value falls below the calculated lower bound of the 95% confidence interval (approximately 65.1%).
- While close, this score would not meet the professor's criterion for passing.
*67.5%*
- This value is within the 95% confidence interval (65.1% to 74.9%) but is **not the minimum score**.
- Lower scores within the interval would still qualify as passing.
*20%*
- This score is extremely low and falls significantly outside the 95% confidence interval for a mean of 70%.
- It would indicate performance far below the defined passing threshold.
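The arithmetic in the explanation for Question 1 can be checked with a short sketch. The function name `mean_confidence_interval` is hypothetical, and the calculation assumes the normal approximation with a Z-score of 1.96, as the explanation does.

```python
import math

def mean_confidence_interval(mean, sd, n, z=1.96):
    """Return (lower, upper) bounds of a CI for the mean: mean +/- z * SD / sqrt(n)."""
    standard_error = sd / math.sqrt(n)   # 25 / 10 = 2.5 for this question
    margin = z * standard_error          # 1.96 * 2.5 = 4.9
    return mean - margin, mean + margin

# Values taken from the question: mean 70%, SD 25%, n = 100 students.
lower, upper = mean_confidence_interval(70.0, 25.0, 100)
print(f"95% CI of the mean: {lower:.1f}% to {upper:.1f}%")
```

The lower bound of about 65.1% is what the answer rounds to 65%.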
Relationship between CIs and hypothesis testing US Medical PG Question 2: A randomized, controlled, double-blind study is conducted on the efficacy of 2 sulfonylureas. The study concluded that medication 1 was more efficacious in lowering fasting blood glucose than medication 2 (p ≤ 0.05; 95% CI: 14 [10-21]). Which of the following is true regarding a 95% confidence interval (CI)?
- A. If the same study were repeated multiple times, approximately 95% of the calculated confidence intervals would contain the true population parameter. (Correct Answer)
- B. The 95% confidence interval is the probability chosen by the researcher to be the threshold of statistical significance.
- C. When a 95% CI for the estimated difference between groups contains the value ‘0’, the results are significant.
- D. It represents the probability that chance would not produce the difference shown, 95% of the time.
- E. The study is adequately powered at the 95% confidence interval.
Relationship between CIs and hypothesis testing Explanation: ***If the same study were repeated multiple times, approximately 95% of the calculated confidence intervals would contain the true population parameter.***
- This statement accurately defines the **frequentist interpretation** of a confidence interval (CI). It reflects the long-run behavior of the CI over hypothetical repetitions of the study.
- A 95% CI means that if you were to repeat the experiment many times, 95% of the CIs calculated from those experiments would capture the **true underlying population parameter**.
*The 95% confidence interval is the probability chosen by the researcher to be the threshold of statistical significance.*
- The **alpha level (α)**, typically set at 0.05 (or 5%), is the threshold for statistical significance (p ≤ 0.05), representing the probability of a Type I error.
- The 95% confidence level (1-α) is related to statistical significance, but it is not the *threshold* itself; rather, it indicates the **reliability** of the interval estimate.
*When a 95% CI for the estimated difference between groups contains the value ‘0’, the results are significant.*
- If a 95% CI for the difference between groups **contains 0**, it implies that there is **no statistically significant difference** between the groups at the 0.05 alpha level.
- A statistically significant difference (p ≤ 0.05) would be indicated if the 95% CI **does NOT contain 0**, suggesting that the intervention had a real effect.
*It represents the probability that chance would not produce the difference shown, 95% of the time.*
- This statement misinterprets what a CI is. The role of chance is assessed by the **p-value**, the probability of observing a difference at least as extreme as the one found if the null hypothesis were true; a CI makes no such probability statement.
- A CI provides a **range of plausible values** for the population parameter, not a probability about the role of chance in producing the observed difference.
*The study is adequately powered at the 95% confidence interval.*
- **Statistical power** is the probability of correctly rejecting a false null hypothesis, typically set at 80% or 90%. It is primarily determined by sample size, effect size, and alpha level.
- A 95% CI is a measure of the **precision** of an estimate, while power refers to the **ability of a study to detect an effect** if one exists. They are related but distinct concepts.
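The frequentist interpretation in Question 2 can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population (normal, true mean 14, SD 5), the sample size, and the function name `ci_coverage` are all hypothetical, and the interval uses Z = 1.96 with the sample standard deviation.

```python
import random
import statistics

def ci_coverage(true_mean=14.0, sd=5.0, n=50, repeats=2000, z=1.96, seed=0):
    """Repeat a hypothetical study many times and return the fraction of
    95% CIs (mean +/- z * SE) that contain the true population mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        se = statistics.stdev(sample) / n ** 0.5
        if mean - z * se <= true_mean <= mean + z * se:
            hits += 1
    return hits / repeats

coverage = ci_coverage()
print(f"Fraction of intervals containing the true mean: {coverage:.3f}")
```

Under these assumptions the printed fraction should land near 0.95. Any single interval either contains the true mean or does not; the 95% describes the long-run behavior of the procedure.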
Relationship between CIs and hypothesis testing US Medical PG Question 3: A researcher is trying to determine whether a newly discovered substance X can be useful in promoting wound healing after surgery. She conducts this study by enrolling the next 100 patients who will be undergoing this surgery and separating them into 2 groups. She decides which patient will be in which group by using a random number generator. Subsequently, she prepares 1 set of syringes with the novel substance X and 1 set of syringes with a saline control. Both of these sets of syringes are unlabeled and the substances inside cannot be distinguished. She gives the surgeon performing the surgery 1 of the syringes and informs neither him nor the patient which syringe was used. After the study is complete, she analyzes all the data that was collected and performs statistical analysis. This study most likely provides which level of evidence for use of substance X?
- A. Level 3
- B. Level 1 (Correct Answer)
- C. Level 4
- D. Level 5
- E. Level 2
Relationship between CIs and hypothesis testing Explanation: ***Level 1***
- The study design described is a **randomized controlled trial (RCT)**, which is considered the **highest level of evidence (Level 1)** in the hierarchy of medical evidence.
- Key features like **randomization**, **control group**, and **blinding (double-blind)** help minimize bias and strengthen the validity of the findings.
*Level 2*
- Level 2 evidence typically comprises **well-designed controlled trials without randomization** (non-randomized controlled trials) or **high-quality cohort studies**.
- While strong, they do not possess the same level of internal validity as randomized controlled trials.
*Level 3*
- Level 3 evidence typically includes **case-control studies** or **cohort studies**, which are observational designs and carry a higher risk of bias compared to RCTs.
- These studies generally do not involve randomization or intervention assignment by the researchers.
*Level 4*
- Level 4 evidence is usually derived from **case series** or **poor quality cohort and case-control studies**.
- These studies provide descriptive information or investigate associations without strong control for confounding factors.
*Level 5*
- Level 5 evidence is the **lowest level of evidence**, consisting of **expert opinion** or **animal research/bench research**.
- This level lacks human clinical data or systematic investigative rigor needed for higher evidence levels.
Relationship between CIs and hypothesis testing US Medical PG Question 4: You are reading through a recent article that reports significant decreases in all-cause mortality for patients with malignant melanoma following treatment with a novel biological infusion. Which of the following choices refers to the probability that a study will find a statistically significant difference when one truly does exist?
- A. Type II error
- B. Type I error
- C. Confidence interval
- D. p-value
- E. Power (Correct Answer)
Relationship between CIs and hypothesis testing Explanation: ***Power***
- **Power** is the probability that a study will correctly reject the null hypothesis when it is, in fact, false (i.e., will find a statistically significant difference when one truly exists).
- A study with high power minimizes the risk of a **Type II error** (failing to detect a real effect).
*Type II error*
- A **Type II error** (or **beta error**) occurs when a study fails to reject a false null hypothesis, meaning it concludes there is no significant difference when one actually exists.
- This is the **opposite** of what the question describes, which asks for the probability of *finding* a difference.
*Type I error*
- A **Type I error** (or **alpha error**) occurs when a study incorrectly rejects a true null hypothesis, concluding there is a significant difference when one does not actually exist.
- The probability of a Type I error is the **significance level (α)**, the threshold (e.g., 0.05) against which the **p-value** is compared.
*Confidence interval*
- A **confidence interval** provides a range of values within which the true population parameter is likely to lie with a certain degree of confidence (e.g., 95%).
- It does not directly represent the probability of finding a statistically significant difference when one truly exists.
*p-value*
- The **p-value** is the probability of observing data as extreme as, or more extreme than, that obtained in the study, assuming the null hypothesis is true.
- It is used to determine statistical significance, but it is not the probability of detecting a true effect.
Relationship between CIs and hypothesis testing US Medical PG Question 5: A biostatistician is processing data for a large clinical trial she is working on. The study is analyzing the use of a novel pharmaceutical compound for the treatment of anorexia after chemotherapy with the outcome of interest being the change in weight while taking the drug. While most participants remained about the same weight or continued to lose weight while on chemotherapy, there were smaller groups of individuals who responded very positively to the orexic agent. As a result, the data had a strong positive skew. The biostatistician wishes to report the measures of central tendency for this project. Just by understanding the skew in the data, which of the following can be expected for this data set?
- A. Mean = median = mode
- B. Mean < median < mode
- C. Mean > median > mode (Correct Answer)
- D. Mean > median = mode
- E. Mean < median = mode
Relationship between CIs and hypothesis testing Explanation: ***Mean > median > mode***
- In a dataset with a **strong positive skew**, the tail of the distribution is on the right, pulled by a few **unusually large values**.
- These extreme high values disproportionately influence the **mean**, pulling it to the right (higher value), while the **median** (middle value) is less affected, and the **mode** (most frequent value) is often located at the peak of the distribution towards the left.
*Mean = median = mode*
- This relationship between the measures of central tendency is characteristic of a **perfectly symmetrical distribution**, such as a **normal distribution**, where there is no skew.
- In a symmetrical distribution, the mean, median, and mode are all located at the exact center of the data.
*Mean < median < mode*
- This order is typical for a dataset with a **negative skew**, where the tail is on the left due to a few **unusually small values**.
- In a negatively skewed distribution, the mean is pulled to the left (lower value) by the small values, making it less than the median and mode.
*Mean > median = mode*
- This configuration is generally not characteristic of standard skewed distributions and would imply a specific, less common bimodal or complex distribution shape where the mode coincides with the median, but the mean is pulled higher.
- While theoretically possible, it doesn't describe a typical positively skewed distribution where the mode is usually the lowest of the three.
*Mean < median = mode*
- This relationship would suggest a negatively skewed distribution where the median and mode are equal, but the mean is pulled to the left (lower value) by a leftward tail.
- Again, this is a less typical representation of a standard negatively skewed distribution, which often follows the Mean < Median < Mode pattern.
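The ordering described for Question 5 can be seen in a small made-up example. The data below are hypothetical weight changes (in kg) chosen to have a long right tail; they are not from the study in the question.

```python
import statistics

# Hypothetical weight changes (kg): most values are small,
# a few strong responders form the long right tail.
weight_changes = [0, 1, 1, 1, 2, 2, 3, 3, 4, 6, 9, 15, 20]

mean = statistics.fmean(weight_changes)
median = statistics.median(weight_changes)
mode = statistics.mode(weight_changes)

print(f"mean={mean:.2f}, median={median}, mode={mode}")
print("mean > median > mode:", mean > median > mode)
```

The few large values raise the mean well above the median, while the mode stays at the peak on the left, which is the Mean > median > mode pattern.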
Relationship between CIs and hypothesis testing US Medical PG Question 6: A pharmaceutical corporation is developing a research study to evaluate a novel blood test to screen for breast cancer. They enrolled 800 patients in the study, half of whom have breast cancer. The remaining enrolled patients are age-matched controls who do not have the disease. Of those in the diseased arm, 330 are found positive for the test. Of the patients in the control arm, only 30 are found positive. What is this test’s sensitivity?
- A. 330 / (330 + 30)
- B. 330 / (330 + 70) (Correct Answer)
- C. 370 / (30 + 370)
- D. 370 / (70 + 370)
- E. 330 / (400 + 400)
Relationship between CIs and hypothesis testing Explanation: ***330 / (330 + 70)***
- **Sensitivity** measures the proportion of actual **positives** that are correctly identified as such.
- In this study, there are **400 diseased patients** (half of 800). Of these, 330 tested positive (true positives), meaning 70 tested negative (false negatives). So sensitivity is **330 / (330 + 70)**.
*330 / (330 + 30)*
- This calculation represents the **positive predictive value**, which is the probability that subjects with a positive screening test truly have the disease. It uses **true positives / (true positives + false positives)**.
- It does not correctly calculate **sensitivity**, which requires knowing the total number of diseased individuals.
*370 / (30 + 370)*
- This expression is the test's **specificity**, the proportion of actual negatives that are correctly identified: **true negatives / (true negatives + false positives)**. Of the 400 controls, 30 tested positive (false positives), so 370 tested negative (true negatives).
- Although this is the correct calculation for specificity, the question asks for **sensitivity**.
*370 / (70 + 370)*
- This expression is the **negative predictive value**: **true negatives / (true negatives + false negatives)**, the probability that a subject with a negative test truly does not have the disease.
- It combines the 370 true negatives from the control arm with the 70 false negatives from the diseased arm, so it does not measure **sensitivity**.
*330 / (400 + 400)*
- This calculation attempts to divide true positives by the total study population (800 patients).
- This metric represents the **prevalence of true positives within the entire study cohort**, not the test's **sensitivity**.
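The 2×2 table behind Question 6 can be laid out in a short sketch. The function name `diagnostic_metrics` is hypothetical; the counts come from the question (400 diseased with 330 positive, 400 controls with 30 positive).

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard 2x2 table measures, computed from raw counts."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are true disease
        "npv": tn / (tn + fn),          # negative tests that are truly healthy
    }

# 330 true positives, 70 false negatives, 30 false positives, 370 true negatives.
metrics = diagnostic_metrics(tp=330, fn=70, fp=30, tn=370)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity is 330 / 400 = 0.825. The predictive values shown apply only within this study sample, where half of the participants have the disease by design.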
Relationship between CIs and hypothesis testing US Medical PG Question 7: The mean, median, and mode weight of 37 newborns in a hospital nursery is 7 lbs 2 oz. In fact, there are 7 infants in the nursery that weigh exactly 7 lbs 2 oz. The standard deviation of the weights is 2 oz. The weights follow a normal distribution. A newborn delivered at 10 lbs 2 oz is added to the data set. What is most likely to happen to the mean, median, and mode with the addition of this new data point?
- A. The mean will increase; the median will increase; the mode will stay the same
- B. The mean will increase; the median will stay the same; the mode will stay the same (Correct Answer)
- C. The mean will stay the same; the median will increase; the mode will stay the same
- D. The mean will increase; the median will increase; the mode will increase
- E. The mean will stay the same; the median will increase; the mode will increase
Relationship between CIs and hypothesis testing Explanation: ***The mean will increase; the median will stay the same; the mode will stay the same***
- The **mean** is highly sensitive to outliers. Adding a newborn weighing 10 lbs 2 oz (significantly heavier than the original mean of 7 lbs 2 oz) will increase the total sum of weights, thus **increasing the mean**.
- The **median** is the middle value in an ordered dataset. With 37 newborns, the median is the 19th value. Adding one more (38 total) makes the median the average of the 19th and 20th values. The new value (10 lbs 2 oz) is added at the extreme high end of the distribution, so the first 37 ordered positions are unchanged. Because 7 infants weigh exactly the median weight of 7 lbs 2 oz, the 19th and 20th values are most likely both 7 lbs 2 oz. Therefore, the median will most likely **stay the same**.
- The **mode** is the most frequent value. Since there are 7 infants already at 7 lbs 2 oz, adding a single infant at 10 lbs 2 oz will not change the most frequent weight in the dataset. The mode will **stay the same** at 7 lbs 2 oz.
*The mean will increase; the median will increase; the mode will stay the same*
- While the **mean will increase** due to the added outlier, the **median will not change**. With 38 observations, the median becomes the average of the 19th and 20th values, which remain unchanged since the outlier is added at position 38.
- The **mode** correctly stays at 7 lbs 2 oz as the new data point does not become the most frequent value.
*The mean will stay the same; the median will increase; the mode will stay the same*
- The **mean will not stay the same** because an outlier significantly higher than the current mean will always pull the mean higher.
- The **median will also not increase** as the middle values (19th and 20th positions) remain unchanged when adding an extreme outlier.
*The mean will increase; the median will increase; the mode will increase*
- While the **mean will increase**, the **median will not change** because the middle positions are unaffected by adding one extreme outlier.
- The **mode will not change** as the new data point (10 lbs 2 oz) is unique and doesn't become the most frequent value; 7 lbs 2 oz remains most frequent with 7 occurrences.
*The mean will stay the same; the median will increase; the mode will increase*
- This option is incorrect because the **mean will definitely increase** with the addition of a much larger value.
- The **median will not increase** as it depends on the middle positions, not extreme values.
- The **mode will not increase** as adding one 10 lb 2 oz infant won't make that weight the most frequent.
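The effect of the outlier in Question 7 can be checked on a made-up nursery. The question gives only summary statistics, so the individual weights below are hypothetical: a symmetric set of 37 values in ounces with 7 infants at 7 lbs 2 oz (114 oz) and a standard deviation close to 2 oz.

```python
import statistics

OZ_PER_LB = 16

# Hypothetical weights in ounces: {weight: number of infants}, 37 infants total.
counts = {110: 2, 111: 2, 112: 5, 113: 6, 114: 7, 115: 6, 116: 5, 117: 2, 118: 2}
weights = [w for w, c in counts.items() for _ in range(c)]

def summarize(values):
    """Return (mean, median, mode) of a list of weights."""
    return statistics.fmean(values), statistics.median(values), statistics.mode(values)

before = summarize(weights)
after = summarize(weights + [10 * OZ_PER_LB + 2])  # add the 10 lbs 2 oz newborn

print("before (mean, median, mode):", before)
print("after  (mean, median, mode):", after)
```

In this sketch the mean rises from 114 oz to roughly 115.3 oz, while the median and mode both stay at 114 oz.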
Relationship between CIs and hypothesis testing US Medical PG Question 8: Two research groups independently study the same genetic variant's association with diabetes. Study A (n=5,000) reports OR=1.25, 95% CI: 1.05-1.48, p=0.01. Study B (n=50,000) reports OR=1.08, 95% CI: 1.02-1.14, p=0.006. Both studies are methodologically sound. Synthesize these findings to determine the most likely true effect and evaluate implications for clinical and research interpretation.
- A. Study B is definitive because of its larger sample size and should replace Study A's findings
- B. The study with the lower p-value (Study B) is automatically more reliable
- C. The studies are contradictory and no conclusions can be drawn
- D. Study A is correct because it was published first
- E. The true effect is likely modest (closer to Study B's estimate); Study A likely overestimated due to smaller sample size, but both show statistical significance with clinically marginal effects (Correct Answer)
Relationship between CIs and hypothesis testing Explanation: ***The true effect is likely modest (closer to Study B's estimate); Study A likely overestimated due to smaller sample size, but both show statistical significance with clinically marginal effects***
- Study B has significantly higher **statistical power** and **precision** (narrower 95% CI) due to its larger sample size, making its **odds ratio (OR)** estimate more reliable.
- Smaller initial studies often exhibit the **Winner's Curse**, where effect sizes are **overestimated** to reach the threshold for statistical significance.
*Study A is correct because it was published first*
- **Publication order** does not determine the scientific validity or accuracy of genetic association studies.
- Early studies are more prone to **random error** and inflated effect sizes compared to later, larger-scale replications.
*Study B is definitive because of its larger sample size and should replace Study A's findings*
- While Study B is more **precise**, both studies are directionally consistent and both show **statistical significance** (p < 0.05).
- Scientific evidence is **cumulative**; Study B refines and confirms the existence of an association rather than declaring Study A's findings as entirely false.
*The studies are contradictory and no conclusions can be drawn*
- The studies are not contradictory because both **confidence intervals** show an OR > 1.0, and both reach **statistical significance**.
- Both groups found the same **direction of effect**, suggesting a real albeit modest genetic association with diabetes.
*The study with the lower p-value (Study B) is automatically more reliable*
- Reliability depends on **methodological rigor** and **precision**, whereas the p-value is heavily influenced by **sample size**.
- A lower p-value indicates stronger evidence against the **null hypothesis** but does not inherently mean the study is free from bias or more reliable in its effect estimate.
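The difference in precision between the two studies in Question 8 can be quantified from the reported intervals. This sketch assumes each 95% CI was built symmetrically on the log odds ratio scale with Z = 1.96, which is the usual construction but is not stated in the question; the function name `se_log_or` is hypothetical.

```python
import math

def se_log_or(ci_lower, ci_upper, z=1.96):
    """Back-calculate the standard error of log(OR) from a CI,
    assuming the interval is log(OR) +/- z * SE."""
    return (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)

se_a = se_log_or(1.05, 1.48)  # Study A, n = 5,000
se_b = se_log_or(1.02, 1.14)  # Study B, n = 50,000

print(f"SE of log(OR), Study A: {se_a:.4f}")
print(f"SE of log(OR), Study B: {se_b:.4f}")
print(f"Study A's SE is about {se_a / se_b:.1f} times larger")
```

Under that assumption, Study A's standard error is roughly three times Study B's, which is in line with a tenfold difference in sample size (standard error scales with 1/√n).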
Relationship between CIs and hypothesis testing US Medical PG Question 9: A prestigious journal publishes a trial showing a new cancer drug extends survival by 2 months (p=0.001, 95% CI: 1.5-2.5 months). The drug costs $150,000 per patient and causes Grade 3-4 toxicity in 60% of patients. Three prior unpublished trials showed non-significant results (all p>0.20). Synthesize these findings to evaluate the evidence base.
- A. This pattern suggests publication bias; the significant result may be a false positive among multiple trials, and the modest benefit must be weighed against substantial toxicity and cost (Correct Answer)
- B. The confidence interval proves the drug should be standard of care
- C. P-values below 0.01 override concerns about prior negative studies
- D. The published study's highly significant p-value validates the drug's efficacy
- E. The three unpublished trials are irrelevant to evaluating the published study
Relationship between CIs and hypothesis testing Explanation: ***This pattern suggests publication bias; the significant result may be a false positive among multiple trials, and the modest benefit must be weighed against substantial toxicity and cost***
- The existence of three unpublished negative trials alongside one positive one strongly indicates **publication bias** (the file drawer effect), suggesting the positive result might be a **Type I error** or an overestimation.
- **Statistical significance** (p=0.001) does not equal **clinical significance**; a marginal 2-month survival gain must be balanced against extreme **financial cost** and a 60% rate of **Grade 3-4 toxicity**.
*The published study's highly significant p-value validates the drug's efficacy*
- A **low p-value** only indicates that the observed data would be unlikely if the null hypothesis were true in that specific trial; it does not account for the **context** of other failed experiments.
- Efficacy cannot be validated in isolation when the broader **evidence base** (including unpublished data) shows inconsistent results.
*The three unpublished trials are irrelevant to evaluating the published study*
- All relevant clinical trials must be synthesized via **meta-analysis** or systematic review to determine the true **effect size** of an intervention.
- Ignoring unpublished data leads to **evidence distortion**, where clinicians perceive a drug as more effective than it truly is.
*P-values below 0.01 override concerns about prior negative studies*
- No **p-value** can magically override the **prior probability** of a drug's success; consistent negative results in prior trials increase the likelihood that a later positive result is a **false positive**.
- High-impact medical decisions require a consistent **body of evidence** rather than a single outlier result, regardless of the level of significance.
*The confidence interval proves the drug should be standard of care*
- The **95% Confidence Interval** (1.5–2.5 months) describes the **precision** of the estimated survival gain; it does not establish that a gain of this size is **clinically worthwhile**.
- Becoming a **standard of care** requires a favorable **risk-benefit ratio**, which is undermined here by severe **adverse events** and poor **cost-effectiveness**.
Relationship between CIs and hypothesis testing US Medical PG Question 10: A pharmaceutical company conducts 20 different analyses on their trial data, testing for effects on various secondary outcomes. One analysis shows a significant benefit (p=0.03) on hospital readmission rates. The primary outcome (mortality) showed p=0.12. The company seeks FDA approval based on the readmission data. Evaluate the validity and implications of this approach.
- A. Secondary outcomes are more important than primary outcomes when significant
- B. The p=0.03 result is valid and supports approval regardless of the primary outcome
- C. Any p<0.05 in a clinical trial justifies approval
- D. This represents multiple testing without correction, inflating Type I error; the significant result may be due to chance and selective reporting (Correct Answer)
- E. The mortality p-value of 0.12 is close enough to significance to support both findings
Relationship between CIs and hypothesis testing Explanation: ***This represents multiple testing without correction, inflating Type I error; the significant result may be due to chance and selective reporting***
- Performing **multiple comparisons** (20 analyses) without adjustment increases the probability of a **false positive** result; when no true effects exist, an average of 1 out of 20 tests is still expected to reach p < 0.05 by chance alone.
- Reliable conclusions require **multiplicity corrections** (like Bonferroni) or pre-specified testing hierarchies to prevent **selective reporting** or "p-hacking" of secondary outcomes.
*The p=0.03 result is valid and supports approval regardless of the primary outcome*
- A result is not considered valid in isolation when it is one of many tests; the **Type I error rate** is not maintained at 5%.
- Regulatory approval usually requires the **primary outcome** to be met, as secondary outcomes are generally considered **hypothesis-generating**.
*Secondary outcomes are more important than primary outcomes when significant*
- **Primary outcomes** are the pre-defined measures that the trial is specifically powered to detect; ignoring them leads to **bias**.
- Significance in a **secondary outcome** cannot supersede a non-significant primary outcome, especially when the test wasn't protected against multiple comparisons.
*The mortality p-value of 0.12 is close enough to significance to support both findings*
- In frequentist statistics, a **p-value of 0.12** is greater than the standard threshold of 0.05 and must be interpreted as **not statistically significant**.
- "Close" results do not validate other weak findings; they suggest the study failed to reject the **null hypothesis** for the most important clinical endpoint.
*Any p<0.05 in a clinical trial justifies approval*
- Approval requires evidence of both **statistical significance** and **clinical relevance**, typically demonstrated in the primary endpoint.
- **Spurious correlations** occur frequently in large datasets; therefore, a single p < 0.05 obtained through **data dredging** is insufficient for regulatory standards.
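The inflation of Type I error in Question 10 can be put into numbers. This is a minimal sketch that assumes the 20 tests are independent and that every null hypothesis is true; the function names are hypothetical.

```python
def familywise_error_rate(alpha, tests):
    """Probability of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** tests

def bonferroni_threshold(alpha, tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / tests

fwer = familywise_error_rate(0.05, 20)
threshold = bonferroni_threshold(0.05, 20)

print(f"Chance of at least one false positive in 20 tests: {fwer:.1%}")
print(f"Bonferroni-adjusted threshold: {threshold:.4f}")
print("p = 0.03 significant after correction:", 0.03 < threshold)
```

Under these assumptions the chance of at least one spurious "significant" result is about 64%, and the readmission finding (p = 0.03) does not meet the Bonferroni-adjusted threshold of 0.0025.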