P-value controversy and alternatives US Medical PG Practice Questions and MCQs
Practice US Medical PG questions for P-value controversy and alternatives. These multiple choice questions (MCQs) cover important concepts and help you prepare for your exams.
P-value controversy and alternatives US Medical PG Question 1: A group of 100 medical students took an end-of-year exam. The mean score on the exam was 70%, with a standard deviation of 25%. The professor states that a student's score must be within the 95% confidence interval of the mean to pass the exam. Which of the following is the minimum score a student can have to pass the exam?
- A. 45%
- B. 63.75%
- C. 67.5%
- D. 20%
- E. 65% (Correct Answer)
P-value controversy and alternatives Explanation: ***65%***
- To find the **95% confidence interval (CI) of the mean**, we use the formula: Mean ± (Z-score × Standard Error). For a 95% CI, the Z-score is approximately **1.96**.
- The **Standard Error (SE)** is calculated as SD/√n, where n is the sample size (100 students). So, SE = 25%/√100 = 25%/10 = **2.5%**.
- The 95% CI is 70% ± (1.96 × 2.5%) = 70% ± 4.9%, so the lower bound is 70% - 4.9% = **65.1%**. Using the common approximation Z ≈ 2, the interval is 70% ± 5%, giving a minimum passing score of **65%** (see the short calculation sketch after this question's explanation).
*45%*
- This value is significantly lower than the calculated lower bound of the 95% confidence interval (approximately 65.1%).
- It would represent a score far outside the defined passing range.
*63.75%*
- This value falls below the calculated lower bound of the 95% confidence interval (approximately 65.1%).
- While close, this score would not meet the professor's criterion for passing.
*67.5%*
- This value is within the 95% confidence interval (65.1% to 74.9%) but is **not the minimum score**.
- Lower scores within the interval would still qualify as passing.
*20%*
- This score is extremely low and falls significantly outside the 95% confidence interval for a mean of 70%.
- It would indicate performance far below the defined passing threshold.
P-value controversy and alternatives US Medical PG Question 2: You are reading through a recent article that reports significant decreases in all-cause mortality for patients with malignant melanoma following treatment with a novel biological infusion. Which of the following choices refers to the probability that a study will find a statistically significant difference when one truly does exist?
- A. Type II error
- B. Type I error
- C. Confidence interval
- D. p-value
- E. Power (Correct Answer)
P-value controversy and alternatives Explanation: ***Power***
- **Power** is the probability that a study will correctly reject the null hypothesis when it is, in fact, false (i.e., will find a statistically significant difference when one truly exists).
- A study with high power minimizes the risk of a **Type II error** (failing to detect a real effect).
*Type II error*
- A **Type II error** (or **beta error**) occurs when a study fails to reject a false null hypothesis, meaning it concludes there is no significant difference when one actually exists.
- This is the **opposite** of what the question describes, which asks for the probability of *finding* a difference.
*Type I error*
- A **Type I error** (or **alpha error**) occurs when a study incorrectly rejects a true null hypothesis, concluding there is a significant difference when one does not actually exist.
- This relates to the **p-value** and the level of statistical significance (e.g., p < 0.05).
*Confidence interval*
- A **confidence interval** provides a range of values within which the true population parameter is likely to lie with a certain degree of confidence (e.g., 95%).
- It does not directly represent the probability of finding a statistically significant difference when one truly exists.
*p-value*
- The **p-value** is the probability of observing data as extreme as, or more extreme than, that obtained in the study, assuming the null hypothesis is true.
- It is used to determine statistical significance, but it is not the probability of detecting a true effect.
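Power is often easiest to grasp by simulation. The sketch below uses a hypothetical two-arm setup (the effect size, sample size, and alpha are illustrative assumptions, not taken from the question) and estimates power as the fraction of repeated experiments in which a real effect reaches significance; one minus that fraction is the Type II error rate.

```python
import math, random

# Hypothetical example: true mean difference of 0.3 SD, n = 100 per arm,
# two-sided alpha = 0.05. Power = P(significant result | the effect is real).
random.seed(0)
true_effect, n, z_crit, reps = 0.3, 100, 1.96, 5000
se = math.sqrt(2.0 / n)        # SE of a difference in means with SD = 1 per arm

hits = 0
for _ in range(reps):
    observed = random.gauss(true_effect, se)   # one simulated study estimate
    if abs(observed / se) > z_crit:            # crosses the p < 0.05 threshold
        hits += 1
print(f"estimated power ~ {hits / reps:.2f}")  # about 0.56; Type II error ~ 0.44
```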
P-value controversy and alternatives US Medical PG Question 3: A pharmaceutical corporation is developing a research study to evaluate a novel blood test to screen for breast cancer. They enrolled 800 patients in the study, half of whom have breast cancer. The remaining enrolled patients are age-matched controls who do not have the disease. Of those in the diseased arm, 330 test positive. Of the patients in the control arm, only 30 test positive. What is this test's sensitivity?
- A. 330 / (330 + 30)
- B. 330 / (330 + 70) (Correct Answer)
- C. 370 / (30 + 370)
- D. 370 / (70 + 370)
- E. 330 / (400 + 400)
P-value controversy and alternatives Explanation: ***330 / (330 + 70)***
- **Sensitivity** measures the proportion of actual **positives** that are correctly identified as such.
- In this study, there are **400 diseased patients** (half of 800). Of these, 330 tested positive (true positives), meaning 70 tested negative (false negatives). So sensitivity is **330 / (330 + 70)**.
*330 / (330 + 30)*
- This calculation represents the **positive predictive value**, which is the probability that subjects with a positive screening test truly have the disease. It uses **true positives / (true positives + false positives)**.
- It does not correctly calculate **sensitivity**, which requires knowing the total number of diseased individuals.
*370 / (30 + 370)*
- This expression is actually the correct formula for **specificity**, the proportion of actual negatives correctly identified: **true negatives / (true negatives + false positives)** = 370 / (370 + 30) = 92.5%.
- Specificity, however, is not what the question asks for; the question asks for the test's **sensitivity**.
*370 / (70 + 370)*
- This formula is the **negative predictive value (NPV)**: true negatives / (true negatives + false negatives) = 370 / (370 + 70), the probability that a patient with a negative test is truly disease-free.
- It describes the meaning of a negative result, not the test's **sensitivity**.
*330 / (400 + 400)*
- This calculation attempts to divide true positives by the total study population (800 patients).
- This metric represents the **prevalence of true positives within the entire study cohort**, not the test's **sensitivity**.
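Laying out the full 2×2 table from the stem makes it easy to see which formula each answer choice corresponds to. A minimal sketch:

```python
# Question 3: 800 patients, 400 with breast cancer and 400 controls
TP = 330             # diseased patients who test positive
FN = 400 - TP        # diseased patients who test negative = 70
FP = 30              # controls who test positive
TN = 400 - FP        # controls who test negative = 370

sensitivity = TP / (TP + FN)    # 330/400  = 0.825   (option B, the answer)
specificity = TN / (TN + FP)    # 370/400  = 0.925   (option C)
ppv         = TP / (TP + FP)    # 330/360 ~= 0.917   (option A)
npv         = TN / (TN + FN)    # 370/440 ~= 0.841   (option D)
print(sensitivity, specificity, ppv, npv)
```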
P-value controversy and alternatives US Medical PG Question 4: During an evaluation of a new diagnostic imaging modality for detecting salivary gland tumors, 90 patients tested positive out of the 100 patients who tested positive with the gold standard test. A total of 80 individuals tested negative with the new test out of the 100 individuals who tested negative with the gold standard test. What is the positive likelihood ratio for this test?
- A. 80/90
- B. 90/100
- C. 90/20 (Correct Answer)
- D. 90/110
- E. 10/80
P-value controversy and alternatives Explanation: ***90/20***
- The **positive likelihood ratio (LR+)** is calculated as **sensitivity / (1 - specificity)**. To calculate this, we first need to determine the values for true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN).
- Given that 90 out of 100 actual positive patients tested positive, **TP = 90** and **FN = 100 - 90 = 10**. Also, 80 out of 100 actual negative patients tested negative, so **TN = 80** and **FP = 100 - 80 = 20**.
- **Sensitivity** is the true positive rate (TP / (TP + FN)) = 90 / (90 + 10) = 90 / 100.
- **Specificity** is the true negative rate (TN / (TN + FP)) = 80 / (80 + 20) = 80 / 100.
- Therefore, LR+ = (90/100) / (1 - 80/100) = (90/100) / (20/100) = **90/20**.
*80/90*
- This is simply the ratio of true negatives (80) to true positives (90), which is not a standard diagnostic test measure.
- It does not follow the formula for LR+ which is **sensitivity / (1 - specificity)**.
*90/100*
- This value represents the **sensitivity** of the test, which is the proportion of true positives among all actual positives.
- It does not incorporate the **false positive rate** (1 - specificity) in the denominator required for the positive likelihood ratio.
*90/110*
- This is **true positives / all patients who test positive on the new test** (90 / (90 + 20)), i.e., the **positive predictive value**, not the positive likelihood ratio.
- It does not correspond to the formula for the **positive likelihood ratio**, sensitivity / (1 - specificity).
*10/80*
- This is (1 - sensitivity) / specificity = (10/100) / (80/100), which is the **negative likelihood ratio (LR-)** of the test.
- The question asks for the **positive likelihood ratio (LR+)**, not LR-.
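The same 2×2 bookkeeping gives both likelihood ratios directly; the sketch below reproduces LR+ = 90/20 = 4.5 and, for contrast, the LR- that option E's expression equals.

```python
# Question 4: new imaging test vs. gold standard (100 positives, 100 negatives)
TP, FN = 90, 10      # 90 of 100 gold-standard positives test positive
TN, FP = 80, 20      # 80 of 100 gold-standard negatives test negative

sensitivity = TP / (TP + FN)                  # 90/100
specificity = TN / (TN + FP)                  # 80/100
lr_pos = sensitivity / (1 - specificity)      # (90/100)/(20/100) = 4.5
lr_neg = (1 - sensitivity) / specificity      # (10/100)/(80/100) = 0.125, i.e. 10/80
print(lr_pos, lr_neg)
```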
P-value controversy and alternatives US Medical PG Question 5: You are currently employed as a clinical researcher working on clinical trials of a new drug to be used for the treatment of Parkinson's disease. You have already determined the safe clinical dose of the drug in healthy volunteers. You are now in the phase of drug development where the drug is studied in patients with the target disease to determine its efficacy. Which of the following phases is this new drug currently in?
- A. Phase 4
- B. Phase 1
- C. Phase 2 (Correct Answer)
- D. Phase 0
- E. Phase 3
P-value controversy and alternatives Explanation: ***Phase 2***
- **Phase 2 trials** involve studying the drug in patients with the target disease to assess its **efficacy** and further evaluate safety, typically involving a few hundred patients.
- The question describes a stage after safe dosing in healthy patients (Phase 1) and before large-scale efficacy confirmation (Phase 3), focusing on efficacy in the target population.
*Phase 4*
- **Phase 4 trials** occur **after a drug has been approved** and marketed, monitoring long-term effects, optimal use, and rare side effects in a diverse patient population.
- This phase is conducted post-market approval, whereas the question describes a drug still in development prior to approval.
*Phase 1*
- **Phase 1 trials** primarily focus on determining the **safety and dosage** of a new drug in a **small group of healthy volunteers** (or sometimes patients with advanced disease if the drug is highly toxic).
- The question states that the safe clinical dose in healthy volunteers has already been determined, indicating that Phase 1 has been completed.
*Phase 0*
- **Phase 0 trials** are exploratory, very early-stage studies designed to confirm that the drug reaches the target and acts as intended, typically involving a very small number of doses and participants.
- These trials are conducted much earlier in the development process, preceding the determination of safe clinical doses and large-scale efficacy studies.
*Phase 3*
- **Phase 3 trials** are large-scale studies involving hundreds to thousands of patients to confirm **efficacy**, monitor side effects, compare it to commonly used treatments, and collect information that will allow the drug to be used safely.
- While Phase 3 does assess efficacy, it follows Phase 2 and is typically conducted on a much larger scale before submitting for regulatory approval.
P-value controversy and alternatives US Medical PG Question 6: In 2013 the national mean score on the USMLE Step 1 exam was 227 with a standard deviation of 22. Assuming that the scores for 15,000 people follow a normal distribution, approximately how many students scored above the mean but below 250?
- A. 5,100 (Correct Answer)
- B. 4,500
- C. 6,000
- D. 3,750
- E. 6,750
P-value controversy and alternatives Explanation: ***5,100***
- To solve this, first calculate the **z-score** for 250: (250 - 227) / 22 = 1.045, so 250 lies roughly one standard deviation above the mean (227 + 22 = 249).
- By the **68-95-99.7 rule**, about 34% of a normal distribution falls between the mean and +1 SD, so 0.34 × 15,000 ≈ **5,100 students**; using the exact z-table area (≈0.35) gives about 5,280, which is still closest to 5,100. (A short calculation sketch follows this question's explanation.)
*4,500*
- This answer would imply a smaller proportion of students between the mean and 250 (around 30%), which is lower than the calculated z-score of 1.045 suggests.
- It does not accurately reflect the area under the **normal distribution curve** for the given range.
*6,000*
- This option would mean that approximately 40% of students scored in this range, which would correspond to a z-score much higher than 1.045 or a different standard deviation.
- This calculation overestimates the number of students within the specified range.
*3,750*
- This value represents 25% of the total students (15,000 * 0.25), indicating that only a quarter of the distribution lies in this range.
- This significantly underestimates the proportion of students scoring between the mean and 250 for the given standard deviation.
*6,750*
- This option reflects approximately 45% of the total student population (15,000 * 0.45), which would correspond to a much larger z-score or a different distribution.
- This value is an overestimation and does not align with the standard normal distribution probabilities for the given parameters.
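The z-score arithmetic can be verified with the standard normal CDF, which the Python standard library exposes through math.erf; a minimal sketch:

```python
import math

def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mean, sd, n_students = 227, 22, 15_000
z = (250 - mean) / sd                  # ~1.045
area = phi(z) - 0.5                    # area between the mean and 250, ~0.35
print(round(n_students * area))        # ~5280 with the exact z-score
print(round(n_students * 0.34))        # 5100 using the +1 SD ~ 34% shortcut
```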
P-value controversy and alternatives US Medical PG Question 7: A pharmaceutical company has modified one of its existing antibiotics to have an improved toxicity profile. The new antibiotic blocks protein synthesis by first entering the cell and then binding to active ribosomes. The antibiotic mimics the structure of aminoacyl-tRNA. The drug is covalently bonded to the existing growing peptide chain via peptidyl transferase, thereby impairing the rest of protein synthesis and leading to early polypeptide truncation. Where is the most likely site that this process occurs?
- A. E site
- B. 30S small subunit
- C. A site (Correct Answer)
- D. 40S small subunit
- E. P site
P-value controversy and alternatives Explanation: ***A site***
- The **A (aminoacyl) site** is where incoming aminoacyl-tRNAs bind during translation, bringing new amino acids to the ribosome. Since the antibiotic mimics **aminoacyl-tRNA** and is covalently bonded to the peptide chain by **peptidyl transferase**, its action must occur at the A site.
- Binding at the A site and subsequent peptide bond formation with the antibiotic would lead to premature polypeptide truncation, as no further amino acids can be added.
*E site*
- The **E (exit) site** is where deacylated tRNAs are released from the ribosome after having delivered their amino acid to the growing peptide chain in the P site.
- The antibiotic's mechanism of action, involving binding and covalent incorporation into the peptide, does not align with the function of the E site.
*30S small subunit*
- The **30S small ribosomal subunit** in prokaryotes is primarily involved in mRNA binding and decoding, ensuring the correct aminoacyl-tRNA binds to the mRNA codon.
- While the antibiotic binds to active ribosomes, its key action described as mimicking aminoacyl-tRNA and being incorporated by peptidyl transferase points to a specific binding site within the ribosome rather than the entire subunit's general function.
*40S small subunit*
- The **40S small ribosomal subunit** is found in **eukaryotic ribosomes**, not prokaryotic ones, and is involved in mRNA binding during initiation.
- The question implies an antibiotic targeting bacterial protein synthesis (given its discussion of modifying an existing antibiotic), making eukaryotic ribosomal subunits an unlikely target.
*P site*
- The **P (peptidyl) site** holds the tRNA carrying the growing polypeptide chain. Peptidyl transferase activity forms a peptide bond between the amino acid in the A site and the peptide in the P site.
- While peptidyl transferase is involved, the antibiotic *mimics* aminoacyl-tRNA, which is delivered to the A site for peptide bond formation, rather than the P site which already holds the growing chain.
P-value controversy and alternatives US Medical PG Question 8: Two research groups independently study the same genetic variant's association with diabetes. Study A (n=5,000) reports OR=1.25, 95% CI: 1.05-1.48, p=0.01. Study B (n=50,000) reports OR=1.08, 95% CI: 1.02-1.14, p=0.006. Both studies are methodologically sound. Synthesize these findings to determine the most likely true effect and evaluate implications for clinical and research interpretation.
- A. Study B is definitive because of its larger sample size and should replace Study A's findings
- B. The study with the lower p-value (Study B) is automatically more reliable
- C. The studies are contradictory and no conclusions can be drawn
- D. Study A is correct because it was published first
- E. The true effect is likely modest (closer to Study B's estimate); Study A likely overestimated due to smaller sample size, but both show statistical significance with clinically marginal effects (Correct Answer)
P-value controversy and alternatives Explanation: ***The true effect is likely modest (closer to Study B's estimate); Study A likely overestimated due to smaller sample size, but both show statistical significance with clinically marginal effects***
- Study B has significantly higher **statistical power** and **precision** (narrower 95% CI) due to its larger sample size, making its **odds ratio (OR)** estimate more reliable.
- Smaller initial studies often exhibit the **Winner's Curse**, where effect sizes are **overestimated** to reach the threshold for statistical significance.
*Study A is correct because it was published first*
- **Publication order** does not determine the scientific validity or accuracy of genetic association studies.
- Early studies are more prone to **random error** and inflated effect sizes compared to later, larger-scale replications.
*Study B is definitive because of its larger sample size and should replace Study A's findings*
- While Study B is more **precise**, both studies are directionally consistent and both show **statistical significance** (p < 0.05).
- Scientific evidence is **cumulative**; Study B refines and confirms the existence of an association rather than declaring Study A's findings as entirely false.
*The studies are contradictory and no conclusions can be drawn*
- The studies are not contradictory because both **confidence intervals** show an OR > 1.0, and both reach **statistical significance**.
- Both groups found the same **direction of effect**, suggesting a real albeit modest genetic association with diabetes.
*The study with the lower p-value (Study B) is automatically more reliable*
- Reliability depends on **methodological rigor** and **precision**, whereas the p-value is heavily influenced by **sample size**.
- A lower p-value indicates stronger evidence against the **null hypothesis** but does not inherently mean the study is free from bias or more reliable in its effect estimate.
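One way to make "the true effect is closer to Study B's estimate" concrete is a standard fixed-effect, inverse-variance meta-analysis on the log odds ratio scale, recovering each study's standard error from its reported 95% CI. This is an illustrative sketch, not an analysis performed by either research group.

```python
import math

def pooled_or(studies):
    """Fixed-effect, inverse-variance pooling of odds ratios.
    Each study is (OR, ci_low, ci_high); the SE of log(OR) is recovered from
    the 95% CI width: se = (ln(hi) - ln(lo)) / (2 * 1.96)."""
    num = den = 0.0
    for odds_ratio, lo, hi in studies:
        se = (math.log(hi) - math.log(lo)) / (2 * 1.96)
        weight = 1.0 / se ** 2              # inverse-variance weight
        num += weight * math.log(odds_ratio)
        den += weight
    return math.exp(num / den)

# Study A: OR 1.25 (1.05-1.48); Study B: OR 1.08 (1.02-1.14)
print(round(pooled_or([(1.25, 1.05, 1.48), (1.08, 1.02, 1.14)]), 3))
# Study B's narrow CI dominates the weighting, so the pooled OR lands near 1.10,
# much closer to Study B's estimate than to Study A's.
```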
P-value controversy and alternatives US Medical PG Question 9: A prestigious journal publishes a trial showing a new cancer drug extends survival by 2 months (p=0.001, 95% CI: 1.5-2.5 months). The drug costs $150,000 per patient and causes Grade 3-4 toxicity in 60% of patients. Three prior unpublished trials showed non-significant results (all p>0.20). Synthesize these findings to evaluate the evidence base.
- A. This pattern suggests publication bias; the significant result may be a false positive among multiple trials, and the modest benefit must be weighed against substantial toxicity and cost (Correct Answer)
- B. The confidence interval proves the drug should be standard of care
- C. P-values below 0.01 override concerns about prior negative studies
- D. The published study's highly significant p-value validates the drug's efficacy
- E. The three unpublished trials are irrelevant to evaluating the published study
P-value controversy and alternatives Explanation: ***This pattern suggests publication bias; the significant result may be a false positive among multiple trials, and the modest benefit must be weighed against substantial toxicity and cost***
- The existence of three unpublished negative trials alongside one positive one strongly indicates **publication bias** (the file drawer effect), suggesting the positive result might be a **Type I error** or an overestimation.
- **Statistical significance** (p=0.001) does not equal **clinical significance**; a marginal 2-month survival gain must be balanced against extreme **financial cost** and a 60% rate of **Grade 3-4 toxicity**.
*The published study's highly significant p-value validates the drug's efficacy*
- A **low p-value** indicates only that data this extreme would be unlikely *if the null hypothesis were true* in that specific trial; it does not account for the **context** of other failed experiments.
- Efficacy cannot be validated in isolation when the broader **evidence base** (including unpublished data) shows inconsistent results.
*The three unpublished trials are irrelevant to evaluating the published study*
- All relevant clinical trials must be synthesized via **meta-analysis** or systematic review to determine the true **effect size** of an intervention.
- Ignoring unpublished data leads to **evidence distortion**, where clinicians perceive a drug as more effective than it truly is.
*P-values below 0.01 override concerns about prior negative studies*
- No **p-value** overrides the **prior probability** of a drug's success; consistent negative results in prior trials increase the likelihood that a later positive result is a **false positive**.
- High-impact medical decisions require a consistent **body of evidence** rather than a single outlier result, regardless of the level of significance.
*The confidence interval proves the drug should be standard of care*
- The **95% Confidence Interval** (1.5–2.5 months) tells us only about the **precision** of the measurement, not the **magnitude of clinical benefit**.
- Becoming a **standard of care** requires a favorable **risk-benefit ratio**, which is undermined here by severe **adverse events** and poor **cost-effectiveness**.
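A back-of-the-envelope calculation (illustrative, assuming the four trials are independent and each tested at a two-sided alpha of 0.05) shows how easily one "significant" trial can arise among four even if the drug does nothing:

```python
# If the drug truly had no effect, how often would at least one of four
# independent trials (three unpublished + one published) cross p < 0.05?
alpha, k = 0.05, 4
print(round(1 - (1 - alpha) ** k, 3))   # ~0.185, i.e. nearly a 1-in-5 chance
```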
P-value controversy and alternatives US Medical PG Question 10: A pharmaceutical company conducts 20 different analyses on their trial data, testing for effects on various secondary outcomes. One analysis shows a significant benefit (p=0.03) on hospital readmission rates. The primary outcome (mortality) showed p=0.12. The company seeks FDA approval based on the readmission data. Evaluate the validity and implications of this approach.
- A. Secondary outcomes are more important than primary outcomes when significant
- B. The p=0.03 result is valid and supports approval regardless of the primary outcome
- C. Any p<0.05 in a clinical trial justifies approval
- D. This represents multiple testing without correction, inflating Type I error; the significant result may be due to chance and selective reporting (Correct Answer)
- E. The mortality p-value of 0.12 is close enough to significance to support both findings
P-value controversy and alternatives Explanation: ***This represents multiple testing without correction, inflating Type I error; the significant result may be due to chance and selective reporting***
- Performing **multiple comparisons** (20 analyses) without adjustment increases the probability of a **false positive** result; by chance alone, 1 out of 20 tests is expected to be significant at p < 0.05.
- Reliable conclusions require **multiplicity corrections** (such as Bonferroni) or a pre-specified testing hierarchy to prevent **selective reporting** or "p-hacking" of secondary outcomes (see the sketch after this question's explanation).
*The p=0.03 result is valid and supports approval regardless of the primary outcome*
- A result is not considered valid in isolation when it is one of many tests; the **family-wise Type I error rate** is no longer controlled at 5%.
- Regulatory approval usually requires the **primary outcome** to be met, as secondary outcomes are generally considered **hypothesis-generating**.
*Secondary outcomes are more important than primary outcomes when significant*
- **Primary outcomes** are the pre-defined measures that the trial is specifically powered to detect; ignoring them leads to **bias**.
- Significance in a **secondary outcome** cannot supersede a non-significant primary outcome, especially when the test wasn't protected against multiple comparisons.
*The mortality p-value of 0.12 is close enough to significance to support both findings*
- In frequentist statistics, a **p-value of 0.12** is greater than the standard threshold of 0.05 and must be interpreted as **not statistically significant**.
- "Close" results do not validate other weak findings; they suggest the study failed to reject the **null hypothesis** for the most important clinical endpoint.
*Any p<0.05 in a clinical trial justifies approval*
- Approval requires evidence of both **statistical significance** and **clinical relevance**, typically demonstrated in the primary endpoint.
- **Spurious correlations** occur frequently in large datasets; therefore, a single p < 0.05 obtained through **data dredging** is insufficient for regulatory standards.
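The "1 out of 20 by chance" point above can be made concrete: the sketch below computes the family-wise error rate for 20 unadjusted tests and applies a Bonferroni-adjusted threshold to the reported p = 0.03.

```python
# Question 10: 20 unadjusted analyses, each tested at alpha = 0.05
alpha, m = 0.05, 20

fwer = 1 - (1 - alpha) ** m       # chance of >= 1 false positive if all nulls are true
print(round(fwer, 2))             # ~0.64

bonferroni_alpha = alpha / m      # Bonferroni-adjusted per-test threshold
print(bonferroni_alpha)           # 0.0025
print(0.03 <= bonferroni_alpha)   # False: the p = 0.03 finding does not survive correction
```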
More P-value controversy and alternatives US Medical PG questions available in the OnCourse app. Practice MCQs, flashcards, and get detailed explanations.