Study Design US Medical PG Practice Questions and MCQs
Question 71: You are reading through a recent article that reports significant decreases in all-cause mortality for patients with malignant melanoma following treatment with a novel biological infusion. Which of the following choices refers to the probability that a study will find a statistically significant difference when one truly does exist?
A. Type II error
B. Type I error
C. Confidence interval
D. p-value
E. Power (Correct Answer)
Explanation: ***Power***
- **Power** is the probability that a study will correctly reject the null hypothesis when it is, in fact, false (i.e., will find a statistically significant difference when one truly exists).
- Power equals **1 - β**, where β is the probability of a **Type II error** (failing to detect a real effect), so a study with high power has a low risk of a Type II error.
*Type II error*
- A **Type II error** (or **beta error**) occurs when a study fails to reject a false null hypothesis, meaning it concludes there is no significant difference when one actually exists.
- This is the **opposite** of what the question describes, which asks for the probability of *finding* a difference.
*Type I error*
- A **Type I error** (or **alpha error**) occurs when a study incorrectly rejects a true null hypothesis, concluding there is a significant difference when one does not actually exist.
- The accepted probability of a Type I error is the significance level, **alpha** (e.g., 0.05), which is the threshold the **p-value** is compared against.
*Confidence interval*
- A **confidence interval** provides a range of values within which the true population parameter is likely to lie with a certain degree of confidence (e.g., 95%).
- It does not directly represent the probability of finding a statistically significant difference when one truly exists.
*p-value*
- The **p-value** is the probability of observing data as extreme as, or more extreme than, that obtained in the study, assuming the null hypothesis is true.
- It is used to determine statistical significance, but it is not the probability of detecting a true effect.
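The relationship between power, sample size, and Type II error can be illustrated numerically. The following is a minimal sketch, assuming a two-sided, two-sample z-test with a known standard deviation and equal group sizes; the function name `z_test_power` and the effect size of 0.5 standard deviations are hypothetical choices for illustration and do not come from the article in the question.

```python
from statistics import NormalDist

def z_test_power(true_diff, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test (equal group sizes)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Standard error of the difference between the two group means.
    se = sd * (2 / n_per_group) ** 0.5
    shift = true_diff / se
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)

# Hypothetical effect size: a true difference of 0.5 standard deviations.
for n in (20, 50, 100, 200):
    power = z_test_power(true_diff=0.5, sd=1.0, n_per_group=n)
    print(f"n per group = {n:>3}: power = {power:.2f}, beta = {1 - power:.2f}")
```

With no true difference (`true_diff=0`), the function returns alpha, the Type I error rate. With a real difference, power rises toward 1 as the sample size grows, and β falls accordingly.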
Question 72: On morning labs, a patient's potassium comes back at 5.9 mEq/L. The attending thinks that this result is spurious, and asks the team to repeat the electrolytes. Inadvertently, the medical student, intern, and resident all repeat the electrolytes that same morning. The following values are reported: 4.3 mEq/L, 4.2 mEq/L, and 4.2 mEq/L. What is the median potassium value for that patient that day including the first value?
A. 4.3 mEq/L
B. 4.65 mEq/L
C. 4.25 mEq/L (Correct Answer)
D. 1.7 mEq/L
E. 4.2 mEq/L
Explanation: ***4.25 mEq/L***
- The question asks for the median including **all four potassium values**: 5.9, 4.3, 4.2, and 4.2 mEq/L.
- To find the **median**, first arrange the values in ascending order: **4.2, 4.2, 4.3, 5.9**.
- With an **even number of values (4)**, the median is the **average of the two middle numbers**: (4.2 + 4.3) / 2 = **4.25 mEq/L**.
- This correctly represents the **central tendency** of all laboratory values obtained that day.
*4.3 mEq/L*
- This is the **third value** in the sorted dataset (4.2, 4.2, 4.3, 5.9).
- This would be the median if there were an **odd number of values**, where you would simply take the middle value.
- With an even number of data points, you must **average the two middle values** (4.2 and 4.3), not select just one.
*4.65 mEq/L*
- This is the **mean** of the four values: (4.2 + 4.2 + 4.3 + 5.9) / 4 = 18.6 / 4 = **4.65 mEq/L**.
- The mean is pulled upward by the outlying value of 5.9 mEq/L, whereas the median is not.
- The median instead requires sorting the values and identifying the **middle position(s)** in the dataset.
*1.7 mEq/L*
- This is the **range** of the dataset: 5.9 - 4.2 = **1.7 mEq/L**.
- The range is a measure of **dispersion**, not of central tendency.
- Read as a potassium level, 1.7 mEq/L would indicate **severe hypokalemia**, which is not supported by any of the laboratory values obtained.
*4.2 mEq/L*
- This is the **mode** of the dataset (the most frequently occurring value, appearing twice).
- While mode is a valid measure of central tendency, the question specifically asks for the **median**, not the mode.
- The median of this dataset (4.2, 4.2, 4.3, 5.9) is **4.25 mEq/L**, not 4.2 mEq/L.
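The four summary statistics behind the answer choices can be checked with Python's standard `statistics` module. This is a small sketch using only the four potassium values given in the question; rounding to two decimals is added to avoid floating-point noise in the printed output.

```python
from statistics import mean, median, mode

potassium = [5.9, 4.3, 4.2, 4.2]  # mEq/L, the four values from the question

print("sorted:", sorted(potassium))
print("median:", round(median(potassium), 2))   # average of the two middle values
print("mean:  ", round(mean(potassium), 2))     # pulled upward by 5.9
print("mode:  ", mode(potassium))               # most frequent value
print("range: ", round(max(potassium) - min(potassium), 2))
```

The output gives a median of 4.25, a mean of 4.65, a mode of 4.2, and a range of 1.7, which correspond to answer choices C, B, E, and D respectively.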
Question 73: A 24-year-old woman presents to a medical office for a follow-up evaluation. The medical history is significant for type 1 diabetes, for which she takes insulin. She was recently hospitalized for diabetic ketoacidosis following a respiratory infection. Today she brings in a list of her most recent early morning fasting blood glucose readings for review. Her glucose readings range from 126 mg/dL–134 mg/dL, except for 2 readings of 350 mg/dL and 380 mg/dL, taken at the onset of her recent hospitalization. Given this data set, which measure(s) of central tendency would be most likely affected by these additional extreme values?
A. Mean (Correct Answer)
B. Median and mode
C. Median
D. Mean and median
E. Mode
Explanation: ***Mean***
- The **mean** is calculated by summing all values and dividing by the total number of values; thus, it is significantly influenced by **extreme values** or outliers.
- The two high blood glucose readings (350 mg/dL and 380 mg/dL) will **disproportionately increase** the mean, pulling it away from the central tendency of the majority of readings.
*Median and mode*
- The **mode** is the most frequent value, which would likely still be within the 126-134 mg/dL range since most readings fall there, and the **median** (the middle value) is less affected by outliers.
- Even with two extreme values, the median of this dataset, assuming several readings in the 126-134 mg/dL range, would remain close to the central cluster of typical values and not be drastically altered.
*Median*
- The **median** is resistant to outliers because it is determined by the position of values once ordered, not their magnitude.
- Adding a few extreme values will only shift the median slightly, if at all, especially if the sample size is large enough that the middle position remains within the range of typical values.
*Mean and median*
- While the **mean** is heavily affected by outliers, the **median** is relatively robust to them.
- Therefore, stating that both would be significantly affected is incorrect because the median would largely retain its representation of the central tendency.
*Mode*
- The **mode** represents the most frequently occurring value in a dataset and is not influenced by the magnitude of extreme values.
- Unless one of the extreme high readings happens to be the most frequently occurring value, the mode would remain within the range of the more common, lower glucose readings.
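The different sensitivity of the mean and the median to outliers can be seen with a short example. The question does not list the patient's individual readings, so the eight values in `typical` below are hypothetical numbers chosen within the stated 126-134 mg/dL range; only the two outliers come from the question.

```python
from statistics import mean, median

# Hypothetical fasting readings (mg/dL) within the stated 126-134 range;
# the patient's actual list is not given in the question.
typical = [126, 128, 129, 130, 130, 131, 132, 134]
outliers = [350, 380]  # the two hospitalization readings from the question

with_outliers = typical + outliers

print("mean   without / with outliers:", mean(typical), "/", mean(with_outliers))
print("median without / with outliers:", median(typical), "/", median(with_outliers))
```

For these assumed readings, the mean jumps from 130 to 177 mg/dL, while the median moves only from 130 to 130.5 mg/dL.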
Question 74: A study on depression is being conducted using Patient Health Questionnaire (PHQ-9) survey data embedded within a popular social media network, with a response size of 500,000 participants. The PHQ-9 scores in this sample are approximately normally distributed. The mean PHQ-9 score is 14, and the standard deviation is 4. How many participants have scores greater than 22?
A. 175,000
B. 17,500
C. 160,000
D. 12,500 (Correct Answer)
E. 25,000
Explanation: ***12,500***
- To find the number of participants with scores greater than 22, first calculate the **z-score** for a score of 22: $Z = \frac{(X - \mu)}{\sigma} = \frac{(22 - 14)}{4} = 2$.
- A z-score of 2 means the score is **2 standard deviations above the mean**. Using the **empirical rule** for a normal distribution, approximately **2.5%** of the data falls beyond 2 standard deviations above the mean (5% total in both tails, so 2.5% in each tail).
- Therefore, $2.5\%$ of the total 500,000 participants is $0.025 \times 500,000 = 12,500$.
*175,000*
- This value represents **35%** of the total population ($175,000 / 500,000 = 0.35$), a much larger proportion than the **normal distribution** allows above a z-score of 2.
- An upper tail of 35% would correspond to a z-score of only about 0.4, indicating a score close to the mean, not two standard deviations above it.
*17,500*
- This value represents **3.5%** of the total population ($17,500 / 500,000 = 0.035$).
- An upper-tail proportion of 3.5% corresponds to a z-score of approximately 1.8 rather than 2, indicating an incorrect calculation or interpretation of the **normal distribution table**.
*160,000*
- This option represents a very large portion of the participants, roughly **32%** of the total population.
- This percentage corresponds to scores lying **more than one standard deviation from the mean in either direction** (100% - 68% = 32%), not scores more than 2 standard deviations above the mean as calculated.
*25,000*
- This value represents **5%** of the total population ($25,000 / 500,000 = 0.05$).
- An upper tail of 5% corresponds to a z-score of approximately 1.65, not 2. Beyond a z-score of 2, only about 2.5% of the data lies in the upper tail.
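The empirical-rule figure of 2.5% is a rounded approximation, and it can be compared with the exact normal tail area. The sketch below uses `statistics.NormalDist` with the mean and standard deviation from the question; treating the scores as exactly normal is an idealization.

```python
from statistics import NormalDist

n_participants = 500_000
phq9 = NormalDist(mu=14, sigma=4)  # mean and SD from the question

z = (22 - phq9.mean) / phq9.stdev
exact_tail = 1 - phq9.cdf(22)  # P(score > 22) under an exactly normal model
rule_tail = 0.025              # empirical-rule approximation used in the explanation

print(f"z-score: {z:.1f}")
print(f"empirical rule: {rule_tail:.3%} -> {rule_tail * n_participants:,.0f} participants")
print(f"exact normal:   {exact_tail:.3%} -> {exact_tail * n_participants:,.0f} participants")
```

Under these assumptions the exact tail is about 2.28% (roughly 11,375 participants), so 12,500 remains the closest answer choice.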
Question 75: A cross-sectional study is investigating the association between smoking and the presence of Raynaud phenomenon in adults presenting to a primary care clinic in a major city. A standardized 3-question survey that assesses symptoms of Raynaud phenomenon was used to clinically diagnose patients if they answered positively to all 3 questions. Sociodemographics, health-related information, and smoking history were collected by trained interviewers. Subjects were grouped by their reported tobacco use: non-smokers, less than 1 pack per day (PPD), 1-2 PPD, and over 2 PPD. The results were adjusted for gender, age, education, and alcohol consumption. The adjusted odds ratios (OR) were as follows:
Non-smoker: OR = reference
<1 PPD: OR = 1.49 [95% confidence interval (CI), 1.24-1.79]
1-2 PPD: OR = 1.91 [95% CI, 1.72-2.12]
>2 PPD: OR = 2.21 [95% CI, 2.14-2.37]
Which of the following is represented in this study and suggests a potential causal relationship between smoking and Raynaud phenomenon?
A. Confounding
B. Blinding
C. Consistency
D. Temporality
E. Dose-response (Correct Answer)
Explanation: ***Dose-response***
- The study demonstrates a **dose-response relationship** as the odds ratio for Raynaud phenomenon increases with the reported packs per day (PPD) of tobacco use.
- This graded effect, where a higher exposure (more smoking) leads to a stronger outcome (higher odds of Raynaud phenomenon), is a strong indicator of a potential causal link according to the Bradford Hill criteria.
*Confounding*
- **Confounding** occurs when a third variable influences both the exposure and the outcome, creating a spurious association.
- The study specifically states that the results were **adjusted for gender, age, education, and alcohol consumption**, indicating an attempt to control for potential confounders, rather than confounding itself being represented as a causal link.
*Blinding*
- **Blinding** involves preventing participants or researchers from knowing who is receiving a particular treatment or exposure to reduce bias.
- While important in some study designs, this cross-sectional study describes **collected data** and adjusted odds ratios, not a process of blinding.
*Consistency*
- **Consistency** refers to the repeated observation of an association in different studies, populations, or circumstances.
- This study presents its own findings without reference to other research, so it does not demonstrate consistency; rather, it provides a single observation.
*Temporality*
- **Temporality** (or temporal relationship) means that the exposure must precede the outcome for a causal relationship to exist.
- This is a **cross-sectional study**, which assesses both exposure (smoking) and outcome (Raynaud phenomenon) at the same time, making it difficult to definitively establish temporality.
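The graded pattern in the reported odds ratios can be checked with a few lines of code. This is only an informal illustration using the point estimates and confidence intervals from the question; it is not a formal statistical test for trend.

```python
# Adjusted odds ratios and 95% CIs from the question, ordered by exposure level.
odds_ratios = [
    ("<1 PPD", 1.49, (1.24, 1.79)),
    ("1-2 PPD", 1.91, (1.72, 2.12)),
    (">2 PPD", 2.21, (2.14, 2.37)),
]

point_estimates = [estimate for _, estimate, _ in odds_ratios]

# Graded relationship: each exposure level has a higher OR than the one before it.
is_monotonic = all(a < b for a, b in zip(point_estimates, point_estimates[1:]))

# Every interval excludes 1.0, so each level differs from the non-smoker reference.
all_exclude_null = all(low > 1.0 for _, _, (low, _) in odds_ratios)

print("monotonically increasing:", is_monotonic)
print("all CIs exclude 1.0:", all_exclude_null)
```

Both checks print `True` for the values in the question.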
Question 76: A research team develops a new monoclonal antibody checkpoint inhibitor for advanced melanoma that has shown promise in animal studies as well as high efficacy and low toxicity in early phase human clinical trials. The research team would now like to compare this drug to existing standard of care immunotherapy for advanced melanoma. The research team decides to conduct a non-randomized study where the novel drug will be offered to patients who are deemed to be at risk for toxicity with the current standard of care immunotherapy, while patients without such risk factors will receive the standard treatment. Which of the following best describes the level of evidence that this study can offer?
A. Level 1
B. Level 3 (Correct Answer)
C. Level 5
D. Level 4
E. Level 2
Explanation: ***Level 3***
- A **non-randomized controlled trial** like the one described, where patient assignment to treatment groups is based on specific characteristics (risk of toxicity), falls into Level 3 evidence.
- This level typically includes **non-randomized controlled trials** and **well-designed cohort studies** with comparison groups, which are prone to selection bias and confounding.
- The study compares two treatments but lacks randomization, making it Level 3 evidence.
*Level 1*
- Level 1 evidence is the **highest level of evidence**, derived from **systematic reviews and meta-analyses** of multiple well-designed randomized controlled trials or large, high-quality randomized controlled trials.
- The described study is explicitly stated as non-randomized, ruling out Level 1.
*Level 2*
- Level 2 evidence involves at least one **well-designed randomized controlled trial** (RCT) or **systematic reviews** of randomized trials.
- The current study is *non-randomized*, which means it cannot be classified as Level 2 evidence, as randomization is a key criterion for this level.
*Level 4*
- Level 4 evidence includes **case series**, **case-control studies**, and **poorly designed cohort or case-control studies**.
- While the study is non-randomized, it is a controlled comparative trial rather than a case series or retrospective case-control study, placing it at Level 3.
*Level 5*
- Level 5 evidence is the **lowest level of evidence**, typically consisting of **expert opinion** without explicit critical appraisal, or based on physiology, bench research, or animal studies.
- While the drug was initially tested in animal studies, the current human comparative study offers a higher level of evidence than expert opinion or preclinical data.
Question 77: During a clinical study on an island with a population of 2540 individuals, 510 are found to have fasting hyperglycemia. Analysis of medical records of deceased individuals shows that the average age of onset of fasting hyperglycemia is 45 years, and the average life expectancy is 70 years. Assuming a steady state of population on the island with no change in environmental risk factors, which of the following is the best estimate of the number of individuals who would newly develop fasting hyperglycemia over 1 year?
A. 20 (Correct Answer)
B. 50
C. 10
D. 30
E. 40
Explanation: ***Correct Option: 20***
- In a steady-state population, prevalence remains constant when the number of new cases (incidence) equals the number of individuals exiting the disease state (through death from any cause).
- The average duration of fasting hyperglycemia is **life expectancy (70 years) - age of onset (45 years) = 25 years**.
- Using the fundamental relationship **Prevalence = Incidence × Duration**, we can solve for incidence: **Incidence = Prevalence / Duration = 510 / 25 = 20.4 ≈ 20 new cases per year**.
- This means approximately 20 individuals must newly develop fasting hyperglycemia each year to maintain the steady-state prevalence of 510 cases.
*Incorrect Option: 50*
- This would imply a much higher incidence rate, inconsistent with maintaining a steady state.
- If 50 new cases developed annually with an average 25-year duration, the prevalence would be 50 × 25 = 1,250 cases, far exceeding the observed 510.
- This represents an incidence rate 2.5 times higher than what the steady-state equation supports.
*Incorrect Option: 10*
- This represents an incidence rate that is too low to maintain the observed prevalence in a steady-state population.
- With only 10 new cases per year and a 25-year duration, the steady-state prevalence would be 10 × 25 = 250 cases, which is half the observed 510.
- This choice would suggest either a longer disease duration or a declining prevalence over time.
*Incorrect Option: 30*
- This is 1.5 times the calculated incidence, suggesting an expanding prevalence rather than a steady state.
- With 30 new cases annually over a 25-year duration, the steady-state prevalence would reach 750 cases, exceeding the observed 510.
- Although closer than the other incorrect options, it is still inconsistent with the observed prevalence of 510 under Prevalence = Incidence × Duration.
*Incorrect Option: 40*
- This value is twice the calculated incidence, indicating a scenario where prevalence would be rapidly increasing.
- If 40 new cases developed per year with a 25-year duration, the steady-state prevalence would be 1,000 cases, nearly double the observed 510.
- This contradicts the assumption of a steady-state population with stable disease prevalence.
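The steady-state calculation and the prevalence implied by each answer choice can be reproduced as follows. The function name `steady_state_incidence` is a hypothetical helper introduced here; the numbers come from the question.

```python
def steady_state_incidence(prevalent_cases, mean_duration_years):
    """New cases per year needed to keep prevalence constant (P = I x D)."""
    return prevalent_cases / mean_duration_years

duration = 70 - 45  # life expectancy minus average age of onset, in years
print("estimated new cases per year:", steady_state_incidence(510, duration))

# Steady-state prevalence implied by each answer choice.
for option in (10, 20, 30, 40, 50):
    print(f"{option} cases/year -> {option * duration} prevalent cases")
```

Only 20 new cases per year implies a prevalence (500) close to the observed 510.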
Question 78: A study aimed to evaluate the relationship between inflammatory markers and lipid metabolism in individuals with rheumatoid arthritis (RA) recruited 252 patients with RA in a tertiary care hospital. Fasting blood samples were taken for lipid profiling and for the assessment of inflammatory markers such as C-reactive protein (CRP) and erythrocyte sedimentation rate. The relationship between CRP and total cholesterol was assessed using Pearson’s correlation coefficient. A scatter plot between CRP and total cholesterol can be seen in the picture. Based on the scatter plot, which of the following can be correctly concluded about the value of the Pearson correlation coefficient, r, for CRP and total cholesterol?
A. r value is exactly +1
B. r value is exactly 0
C. r value is exactly -1
D. r value lies between 0 and +1
E. r value lies between 0 and -1 (Correct Answer)
Explanation: ***r value lies between 0 and -1***
- The scatter plot shows a **negative association** between CRP and total cholesterol: as CRP levels increase, total cholesterol tends to decrease.
- The data points are somewhat scattered and do not form a perfect straight line, so the correlation is not exactly -1 but falls between **0 and -1**.
*r value is exactly +1*
- An r-value of **+1** would indicate a **perfect positive linear relationship**, where all data points fall precisely on a straight line that slopes upwards, meaning that as CRP increases, total cholesterol also increases.
- The scatter plot clearly shows a downward trend, which contradicts a positive correlation.
*r value is exactly 0*
- An r-value of **0** would suggest **no linear relationship** between the variables, meaning the points would be randomly scattered with no discernible trend.
- The scatter plot demonstrates a clear trend, albeit a negative one, indicating that there is a relationship between CRP and total cholesterol.
*r value is exactly -1*
- An r-value of **-1** would signify a **perfect negative linear relationship**, where all data points would lie exactly on a straight line that slopes downwards.
- While there is a negative trend, the data points show significant scattering, indicating that the relationship is not perfectly linear.
*r value lies between 0 and +1*
- An r-value between **0 and +1** would imply a **positive, but not perfect, linear relationship**.
- This is incorrect because the scatter plot visually depicts a **negative relationship**, where one variable tends to decrease as the other increases.
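The difference between a scattered downward trend and a perfect falling line can be shown by computing r directly. The study's data are not available here, so the CRP and cholesterol values below are hypothetical numbers invented purely for illustration; `pearson_r` is a small helper written from the standard definition of the coefficient.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical values for illustration only; the study's data are not given.
crp = [2, 5, 8, 12, 15, 20, 25, 30]                     # mg/L
cholesterol = [210, 215, 195, 200, 180, 190, 170, 175]  # mg/dL

# Every point lies exactly on a falling straight line.
perfect_line = [220 - 2 * x for x in crp]

print(f"scattered downward trend: r = {pearson_r(crp, cholesterol):.2f}")
print(f"perfect falling line:     r = {pearson_r(crp, perfect_line):.2f}")
```

The scattered data give an r strictly between -1 and 0, while r reaches -1 only when every point falls exactly on the line.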
Question 79: The incidence of a relatively benign autosomal recessive disease, X, is 1 in 25 in the population. Assuming that the conditions for Hardy Weinberg Equilibrium are met, what is the probability that a male and female, who are carriers, will have a child expressing the disease?
A. 1/5
B. 8/25
C. 1/4 (Correct Answer)
D. 1/25
E. 4/5
Explanation: ***1/4***
- If both parents are **carriers** for an autosomal recessive disease, each parent has one copy of the normal allele (A) and one copy of the recessive allele (a).
- When two heterozygous (Aa) individuals mate, the probability of their child inheriting two recessive alleles (aa) and expressing the disease is 1 in 4 (25%), according to Mendelian genetics.
*1/5*
- This value represents the **allele frequency (q)** in the population for the recessive allele, given an incidence of 1 in 25 (q^2 = 1/25, so q = 1/5).
- However, this is not the probability of a child being affected if both parents are already known to be carriers.
*8/25*
- This value is the **carrier frequency (2pq)** in the population: 2 × 4/5 × 1/5 = 8/25.
- It is the probability that a randomly selected individual is a carrier, not the probability of an affected child when both parents are already known to be carriers.
*1/25*
- This is the **incidence of the disease (q^2)** in the general population, which means 1 out of 25 individuals express the disease.
- It is not the probability of a child inheriting the disease from two parents already identified as carriers.
*4/5*
- This value represents the **allele frequency (p)** of the dominant allele (p = 1 - q = 1 - 1/5 = 4/5).
- It is not the probability of a child expressing the disease from two carrier parents.
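Each answer choice maps to a different Hardy-Weinberg or Mendelian quantity, which can be verified with exact fractions. This is a brief sketch using the incidence from the question; the variable names are illustrative.

```python
from fractions import Fraction
from math import isqrt

incidence = Fraction(1, 25)  # q^2, the frequency of affected (aa) individuals

# 1/25 is a perfect square, so q can be computed exactly.
q = Fraction(isqrt(incidence.numerator), isqrt(incidence.denominator))
p = 1 - q
carrier_frequency = 2 * p * q

# Both parents are known carriers (Aa x Aa): each passes on "a" with probability 1/2.
affected_child = Fraction(1, 2) * Fraction(1, 2)

print("q =", q)
print("p =", p)
print("2pq =", carrier_frequency)
print("P(affected child | both parents carriers) =", affected_child)
```

The population frequencies (q = 1/5, p = 4/5, 2pq = 8/25) account for the distractors, while the answer of 1/4 depends only on the parents' known genotypes.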
Question 80: A 21-year-old woman is diagnosed with a rare subtype of anti-NMDA encephalitis. During the diagnostic workup, she was found to have an ovarian teratoma. Her physician is curious about the association between anti-NMDA encephalitis and ovarian teratomas. A causal relationship between this subtype of anti-NMDA encephalitis and ovarian teratomas is suspected. The physician aims to identify patients with anti-NMDA encephalitis and subsequently evaluate them for the presence of ovarian teratomas. Which type of study design would be the most appropriate?
A. Case-control study
B. Retrospective cohort study (Correct Answer)
C. Cross-sectional study
D. Case series
E. Randomized controlled trial
Explanation: ***Retrospective cohort study***
- This is the **most appropriate design** because the physician starts with a defined group of patients **with anti-NMDA encephalitis** (the exposure/condition) and then evaluates them for the **presence of ovarian teratomas** (the outcome).
- A **cohort study** follows this directional approach: identify individuals with a specific exposure or condition, then assess the frequency or presence of an outcome within that group.
- **Retrospective** cohort studies use **existing medical records** to identify the exposed cohort and determine outcome status, making this practical for studying a rare condition like anti-NMDA encephalitis.
- This design allows calculation of the **prevalence** of ovarian teratomas among anti-NMDA encephalitis patients and can suggest an association between the two conditions.
*Cross-sectional study*
- Cross-sectional studies assess **both exposure and outcome simultaneously** at a single point in time in a population, rather than starting with one condition and looking for another.
- This design would be appropriate if the physician surveyed a population and assessed both anti-NMDA encephalitis and ovarian teratomas at the same time, but the question describes a **directional evaluation** (first identify encephalitis patients, then evaluate for teratomas).
- While cross-sectional studies can identify associations, they do not follow the sequential approach described in the clinical scenario.
*Case series*
- A **case series** is a descriptive study that reports characteristics or outcomes in a group of patients with a particular condition but lacks a comparison group and does not systematically evaluate associations.
- While it could describe ovarian teratoma findings in anti-NMDA encephalitis patients, it does not provide the structured framework for assessing prevalence or association that a cohort study offers.
*Case-control study*
- **Case-control studies** work in the **opposite direction**: they start with the outcome (e.g., ovarian teratoma cases) and look backward for the exposure (e.g., anti-NMDA encephalitis).
- The physician's approach starts with the **exposure first** (anti-NMDA encephalitis), making a case-control design inappropriate.
- Case-control studies are efficient for studying rare outcomes but are not aligned with the described study plan.
*Randomized controlled trial*
- **RCTs** are experimental studies that randomly assign participants to different interventions to evaluate treatment efficacy or causation.
- This is an **observational research question** about naturally occurring associations, not an intervention study. Neither anti-NMDA encephalitis nor an ovarian teratoma can be randomly assigned to participants, so an RCT is neither feasible nor ethical here.