What percentage of values in a normal distribution lie within one standard deviation of the mean?
Which of the following is Pearson's measure of skewness?
In a group, the mean blood glucose level is 105 mg/dL with a standard deviation of 10 mg/dL. Assuming a normal distribution, what is the expected range of blood glucose values for approximately 95% of the population?
In a study, an investigator compared the mean blood pressure and standard deviation between two independent groups. Which of the following is the most appropriate statistical test to assess the significance of the difference between the two means?
In a normal distribution, the mean is 82 and the SD is 1.5. What range is covered by the mean ± two standard deviations?
A total of 2000 patients were assessed for HIV. 200 were diagnosed positive. A new ELISA screening test was tested on the same group. It showed 260 as positive out of which only 130 had the disease. What is the specificity of the test?
What is the denominator in infant mortality rate?
The urban area of Delhi has 4000 people with different religions. Research is being done to study the dietary habits of the population. Which of the following techniques can be used to obtain a study sample?
In a Primary Health Centre (PHC), a study was done on diabetic patients. Mean weight was recorded before and after 6 months of dietary intervention. Which statistical test should be used to determine significance?
Which of the following is incorrectly matched with respect to the statistical parameter?
Explanation: This question tests your knowledge of the **Normal Distribution (Gaussian) Curve**, a fundamental concept in biostatistics used to describe how continuous variables (like height, blood pressure, or hemoglobin levels) are distributed in a population.

### **Explanation of the Correct Answer**

In a perfectly symmetrical, bell-shaped normal distribution, the mean, median, and mode coincide at the center. The spread of data is measured by the **Standard Deviation (SD)**. According to the **Empirical Rule** (also known as the 68-95-99.7 rule):

* **Mean ± 1 SD** covers approximately **68.2%** of the values.
* **Mean ± 2 SD** covers approximately **95.4%** of the values.
* **Mean ± 3 SD** covers approximately **99.7%** of the values.

Therefore, **68%** is the correct percentage for values within one standard deviation.

### **Analysis of Incorrect Options**

* **A. 50%:** In a normal distribution, 50% of values lie below the mean and 50% lie above it. It does not represent the range of one SD.
* **C. 95%:** This represents the area covered by **two standard deviations** (specifically 1.96 SD is used for the 95% Confidence Interval).
* **D. 100%:** Theoretically, the tails of a normal distribution curve are asymptotic (they never touch the x-axis), meaning it extends to infinity. 100% is never technically reached within a finite number of SDs.

### **High-Yield Clinical Pearls for NEET-PG**

1. **Z-Score:** This indicates how many standard deviations a data point is from the mean. A Z-score of +1 corresponds to the 84th percentile.
2. **Confidence Interval (CI):** For a 95% CI, we use the formula: $\text{Mean} \pm 1.96 \times \text{SEM}$ (Standard Error of Mean).
3. **Skewness:** If the tail is longer on the right, it is **Positively Skewed** (Mean > Median > Mode). If longer on the left, it is **Negatively Skewed** (Mode > Median > Mean).
4. **Standard Normal Distribution:** A specific normal distribution where the **Mean is 0** and the **Standard Deviation is 1**.
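The Empirical Rule percentages can be reproduced numerically. The sketch below is illustrative only: the function names `coverage_within` and `percentile_of_z` are hypothetical, and it relies on the standard identity that the area within mean ± k SD of a normal curve equals erf(k/√2).

```python
import math


def coverage_within(k: float) -> float:
    """Fraction of a normal distribution lying within mean ± k SD."""
    return math.erf(k / math.sqrt(2))


def percentile_of_z(z: float) -> float:
    """Cumulative probability (0 to 1) below a given Z-score."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Print the coverage for 1, 2 and 3 SD, plus the exact 95% multiplier.
for k in (1, 2, 3, 1.96):
    print(f"Mean ± {k} SD covers {coverage_within(k):.2%}")

# Percentile that corresponds to a Z-score of +1.
print(f"Z = +1 lies at the {percentile_of_z(1):.1%} point of the curve")
```

Running it shows roughly 68%, 95% and 99.7% for 1, 2 and 3 SD, and about the 84th percentile for Z = +1, in line with the rule quoted above.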
Explanation: **Pearson’s Coefficient of Skewness** is a measure used in biostatistics to determine the asymmetry of a probability distribution. In a perfectly symmetrical (Normal) distribution, the Mean, Median, and Mode are equal, resulting in a skewness of zero.

#### 1. Why Option B is Correct

Karl Pearson’s first coefficient of skewness is defined by the formula:

**Skewness = (Mean – Mode) / Standard Deviation (SD)**

* **Logic:** It measures how far the Mean is pulled away from the Mode relative to the dispersion (SD) of the data.
* **Directionality:**
  * If **Mean > Mode**, the result is positive (**Positive Skew**; tail to the right).
  * If **Mean < Mode**, the result is negative (**Negative Skew**; tail to the left).

#### 2. Why Other Options are Incorrect

* **Option A ((Mode – Mean) / SD):** This reverses the numerator of the correct formula, so it would give the wrong sign for the direction of the skew.
* **Option C (SD / (Mode – Mean)):** This is mathematically incorrect; the Standard Deviation serves as the denominator to "standardize" the measure, not the numerator.
* **Option D:** As listed, this option repeats the correct answer; the defining formula remains Mean minus Mode divided by SD.

#### 3. High-Yield Clinical Pearls for NEET-PG

* **Alternative Formula:** Since the Mode can be unstable in some datasets, Pearson’s second coefficient is often used: **3 (Mean – Median) / SD**.
* **Relationship in Skewed Data:**
  * **Positively Skewed:** Mean > Median > Mode (e.g., income distribution, incubation periods).
  * **Negatively Skewed:** Mode > Median > Mean (e.g., age at death in developed countries).
* **Memory Aid:** In a positive skew, the "Mean" is the "Meanest" (highest value) because it is most affected by extreme outliers in the tail.
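Both Pearson coefficients are easy to compute by hand or in code. The following is a minimal sketch using Python's `statistics` module; the function names and the sample data are hypothetical, and the sample SD (`statistics.stdev`) is assumed as the denominator.

```python
import statistics


def pearson_first(data):
    """Pearson's first coefficient: (mean - mode) / SD."""
    return (statistics.mean(data) - statistics.mode(data)) / statistics.stdev(data)


def pearson_second(data):
    """Pearson's second coefficient: 3 * (mean - median) / SD."""
    return 3 * (statistics.mean(data) - statistics.median(data)) / statistics.stdev(data)


# Hypothetical right-skewed data (a few large values pull the mean upward).
right_skewed = [2, 3, 3, 3, 4, 5, 6, 9, 14]
# Hypothetical symmetric data (mean = median = mode = 3).
symmetric = [1, 2, 3, 3, 3, 4, 5]

print(pearson_first(right_skewed), pearson_second(right_skewed))  # both positive
print(pearson_first(symmetric), pearson_second(symmetric))        # both zero
```

In the right-skewed example the mean exceeds both the median and the mode, so both coefficients come out positive, matching the Mean > Median > Mode ordering.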
Explanation:

***85 – 125 mg/dL***
- This range is calculated using the **Empirical Rule** ($\text{Mean} \pm 2 \text{ SD}$), which states that approximately 95% of observations in a **normal distribution** fall within two standard deviations of the mean.
- Calculation: $105 \text{ mg/dL} \pm (2 \times 10 \text{ mg/dL}) = 105 \pm 20 \text{ mg/dL}$, resulting in the range **85 – 125 mg/dL**.

*101 – 110 mg/dL*
- This range is too narrow: it covers only about 4 to 5 mg/dL on either side of the mean and cannot provide the required **95%** coverage when the standard deviation is 10 mg/dL.
- Choosing this range indicates that the wrong **standard deviation** multiplier was applied; 95% coverage requires two standard deviations.

*90 – 125 mg/dL*
- While the upper limit ($125 \text{ mg/dL}$) is correct ($\text{Mean} + 2 \text{ SD}$), the lower limit ($90 \text{ mg/dL}$) is incorrect, as the range must be symmetrical around the mean in a **normal distribution**.
- This asymmetrical range does not match the **95% range** defined by $\text{Mean} \pm 2 \text{ SD}$.

*95 – 115 mg/dL*
- This range is calculated using $\text{Mean} \pm 1 \text{ SD}$ ($105 \pm 10 \text{ mg/dL}$), which only includes approximately **68%** of the data according to the **Empirical Rule**, not 95%.
- To capture **95%** of the population data, clinicians and students must use **two standard deviations** from the mean.
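The Mean ± k SD calculation can be written as a one-line helper. This is a small illustrative sketch; `sd_range` is a hypothetical name, and the inputs are the values given in the question.

```python
def sd_range(mean, sd, k=2):
    """Return the (lower, upper) limits of mean ± k standard deviations."""
    return (mean - k * sd, mean + k * sd)


# Blood glucose example from the question: mean 105 mg/dL, SD 10 mg/dL.
print(sd_range(105, 10))       # ~95% of values (mean ± 2 SD)
print(sd_range(105, 10, k=1))  # ~68% of values (mean ± 1 SD)
```

With `k=2` the limits are 85 and 125 mg/dL; with `k=1` they are 95 and 115 mg/dL, which is the distractor that covers only about 68% of values.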
Explanation:

***Unpaired t-test***
- It is the most appropriate statistical test used to compare the means of two independent (unrelated) groups when the data is continuous (like **blood pressure**).
- This test assesses the null hypothesis that there is no significant difference between the **population means** of the two comparison groups.

*Paired t-test*
- This test is specifically designed to compare means when the observations are dependent, meaning the data comes from the **same subjects** measured twice (e.g., pre-treatment and post-treatment).
- It is used for **within-group comparisons** rather than comparisons between two independent cohorts, as requested in the scenario.

*Chi-square test*
- The chi-square test is used to determine the association between **two categorical variables** (e.g., proportions or frequencies).
- It is unsuitable here because the variable being compared (blood pressure) is **continuous data**, and the study requires comparing means, not counted frequencies.

*ANOVA*
- ANOVA (Analysis of Variance) is used when comparing the means of **three or more** independent groups.
- While acceptable for two groups (where it gives equivalent results to the t-test), the **unpaired t-test** is the most specific and standard test for comparing means of exactly two independent samples.
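As a rough illustration of what the unpaired t-test computes, the sketch below derives the t statistic and degrees of freedom from two independent samples. It assumes the classic pooled-variance (equal variances) form of the test; the function name and the blood pressure readings are hypothetical, and the p-value lookup against the t distribution is left out.

```python
import math
import statistics


def unpaired_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n1, n2 = len(group_a), len(group_b)
    m1, m2 = statistics.mean(group_a), statistics.mean(group_b)
    v1, v2 = statistics.variance(group_a), statistics.variance(group_b)
    df = n1 + n2 - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se_diff, df


# Hypothetical systolic blood pressure readings (mmHg) in two unrelated groups.
group_a = [120, 122, 118, 121, 119]
group_b = [130, 128, 132, 131, 129]

t_stat, df = unpaired_t(group_a, group_b)
print(f"t = {t_stat:.2f}, df = {df}")
```

The resulting t value is then compared with the critical value of the t distribution at the stated degrees of freedom to judge significance.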
Explanation:

***79-85***
- For a **normal distribution**, the range covering two standard deviations (2 SD) is calculated using the formula: **Mean $\pm$ (2 $\times$ SD)**. The $2\sigma$ interval encompasses approximately **95.45%** of the data points.
- Calculation: Lower limit $= 82 - (2 \times 1.5) = 82 - 3 = 79$. Upper limit $= 82 + (2 \times 1.5) = 82 + 3 = 85$. The correct range is **79-85**.

*60-68*
- This range is incorrect because it lies entirely below the **mean of 82**, and its width (8 units) exceeds the 6-unit span of Mean $\pm$ 2 SD (3 units on each side of the mean).
- The lower limit of 60 is over 14 standard deviations away from the mean, so this range has no relevance to the $2\sigma$ calculation.

*50-57*
- This range is excessively far from the **mean of 82**, and its width (7 units) does not match the 6-unit span required for $\pm 2$ SD.
- Ranges like this would include virtually none of the observations expected in a population with a mean of 82 and a small **standard deviation** of 1.5.

*40-49*
- This interval is centered around 44.5, which is far from the actual **mean of 82**, and therefore cannot represent the population's $\pm 2$ SD range.
- In a normal distribution, the data is symmetric around the mean; any Mean $\pm$ 2 SD range must therefore have the mean at its center, which this option fails to do.
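One way to check each option is to convert its limits into Z-scores, that is, the number of standard deviations each limit lies from the mean. The sketch below is illustrative; `z_score` and `is_k_sd_range` are hypothetical helper names, and the option ranges are the ones listed above.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


def is_k_sd_range(low, high, mean, sd, k=2):
    """True if (low, high) is exactly mean ± k SD."""
    return z_score(low, mean, sd) == -k and z_score(high, mean, sd) == k


mean, sd = 82, 1.5
options = [(79, 85), (60, 68), (50, 57), (40, 49)]

for low, high in options:
    print(
        f"{low}-{high}: z = {z_score(low, mean, sd):.2f} to "
        f"{z_score(high, mean, sd):.2f}, "
        f"matches mean ± 2 SD: {is_k_sd_range(low, high, mean, sd)}"
    )
```

Only 79-85 has limits at Z = -2 and Z = +2; every other option lies many standard deviations below the mean.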
Explanation:

***96%***
- **Specificity** is the ability of a test to correctly identify those *without* the disease (True Negatives) among all disease-free individuals: Specificity = TN / (TN + FP)
- Given data: Total patients = 2000; Actual HIV positive = 200; Actual HIV negative = 1800
- Test showed 260 positives, of which 130 were true positives (TP)
- False Positives (FP) = 260 - 130 = 130
- True Negatives (TN) = Total negatives - FP = 1800 - 130 = 1670
- **Calculated Specificity = 1670/1800 × 100 = 92.78%**
- Among the given options, **96% is the closest** to the calculated value of 92.78%
- Note that 96% matches the **negative predictive value** of this test rather than its specificity: False Negatives (FN) = 200 - 130 = 70, so NPV = TN / (TN + FN) = 1670/1740 × 100 ≈ 95.98%

*80%*
- This value is too low and does not match the calculated specificity
- It does not match the sensitivity either: TP / (TP + FN) = 130/200 × 100 = 65%

*72%*
- This is significantly lower than the actual specificity of 92.78%
- This does not correspond to any standard epidemiological measure from the given data

*68%*
- This is the lowest option and far from the correct calculation
- This may result from calculation errors such as using wrong denominators or confusing different test parameters
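The 2×2 table arithmetic can be checked with a short sketch. The function name `screening_metrics` and its parameter names are hypothetical; the numbers are those given in the question.

```python
def screening_metrics(total, diseased, test_positive, true_positive):
    """Derive the 2x2 table cells and the standard screening-test measures."""
    tp = true_positive
    fp = test_positive - tp        # test positive, disease absent
    fn = diseased - tp             # test negative, disease present
    tn = (total - diseased) - fp   # test negative, disease absent
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Values from the question: 2000 screened, 200 diseased, 260 test positive, 130 true positive.
metrics = screening_metrics(total=2000, diseased=200, test_positive=260, true_positive=130)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

For these inputs the specificity is about 92.8%, the sensitivity 65%, the positive predictive value 50% and the negative predictive value about 96%.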
Explanation:

***Correct: 1000***
- The **Infant Mortality Rate (IMR)** is standardly calculated as the number of deaths of infants under one year of age per **1000 live births** in a given population and time period
- This denominator (per **1000 live births**) is the international standard adopted by organizations like the **WHO** for standardized calculation and comparison of vital rates
- IMR is expressed as deaths per 1000 live births, making it directly comparable across different populations and time periods

*Incorrect: 100*
- A denominator of **100** is used when expressing a rate as a **percentage**, which is not the conventional methodology for reporting IMR
- Using 100 as the denominator would convert the IMR into a percentage, which is not conducive to reliable international comparisons
- Standard vital statistics use 1000 as the base denominator

*Incorrect: 10,000*
- A denominator of **10,000** is occasionally used for reporting rates of very specific, **less common** public health events or diseases
- It is **not** the traditional choice for IMR; standard indices of mortality (like Crude Death Rate, Birth Rate, IMR) rely on a base of **1000**

*Incorrect: 1,00,000*
- A denominator of **1,00,000** (one lakh) is primarily used when calculating incidence or prevalence of extremely **rare diseases** or specific morbidity rates in large populations
- While it provides larger whole numbers, it violates the conventional rule that major vital statistics rates (like IMR) use **1000** as the denominator
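The IMR formula is a simple ratio scaled to 1000 live births. The sketch below is illustrative only; the function name and the figures for deaths and live births are hypothetical, not taken from any real dataset.

```python
def infant_mortality_rate(infant_deaths, live_births, per=1000):
    """Deaths under one year of age per `per` live births in the same period."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return infant_deaths / live_births * per


# Hypothetical area: 30 infant deaths among 1200 live births in one year.
print(f"IMR = {infant_mortality_rate(30, 1200):.1f} per 1000 live births")
```

In this made-up example the rate works out to 25 per 1000 live births; changing `per` to 100 would express the same figure as a percentage, which is not the convention for IMR.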
Explanation:

***Stratified random sampling***
- This technique divides the population (Delhi area) into homogeneous subgroups (strata) based on the defining characteristic, which in this case is **religion**, to ensure proportional representation.
- Since dietary habits are likely to vary significantly across different religious groups, stratification ensures that the study sample accurately reflects the **dietary heterogeneity** of the urban area.

*Cluster random sampling*
- **Cluster sampling** is typically used when the population is large and geographically dispersed; the basic unit sampled is a group (cluster), not the individual.
- Selecting entire geographical clusters might not capture the full diversity of religious dietary habits, potentially leading to increased **sampling error**.

*Simple random sampling*
- **Simple random sampling** selects individuals purely randomly, irrespective of their subgroup (religious) membership.
- This method risks selecting an inadequate number of individuals from smaller religious groups, thereby failing to accurately represent the **dietary practices** of the entire population.

*Systematic random sampling*
- **Systematic sampling** involves selecting every 'n'th member from a list and is logistically simple, but it does not account for the intrinsic heterogeneity (religion) of the population.
- If the initial list is arranged in a pattern related to religious groups, this method could introduce a **hidden bias**, compromising the representativeness of the sample.
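Stratified random sampling with proportional allocation can be sketched in a few lines. Everything in the example is an assumption made for illustration: the function name, the four stratum labels and their sizes, and the 10% sampling fraction. Only the total of 4000 people comes from the question.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from every stratum (proportional allocation)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)
    sample = []
    for members in strata.values():
        # Keep at least one person so that small strata are never dropped.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population of 4000 people in four strata of unequal size.
sizes = {"A": 2400, "B": 1000, "C": 400, "D": 200}
population = [(f"{group}{i}", group) for group, n in sizes.items() for i in range(n)]

sample = stratified_sample(population, stratum_of=lambda p: p[1], fraction=0.1, seed=42)
print(len(sample))
```

Because each stratum contributes the same fraction, even the smallest group is guaranteed a place in the sample, which simple random sampling cannot promise.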
Explanation:

***Paired t-test***
- This test is appropriate for comparing the means of **two related samples** or measurements taken from the **same subjects** at two different time points (before and after intervention).
- The study involves recording the mean weight of the *same* diabetic patients before and after a 6-month dietary intervention, making the samples dependent (paired).

*Unpaired t-test*
- The unpaired t-test (or independent-samples t-test) is used to compare the means of **two independent (unrelated) groups** (e.g., comparing the mean weight of patients in Group A vs. Group B).
- It is unsuitable here because the measurements are taken from the same set of individuals, meaning the data points are related, not independent.

*ANOVA*
- **Analysis of Variance (ANOVA)** is used to compare the means of **three or more** independent groups (e.g., comparing mean weight across three different regions).
- It is used when there are multiple levels of a factor or multiple independent variables, which is not the case when comparing two time points.

*Chi-square test*
- The Chi-square test is primarily used to analyze **categorical data** (frequencies or proportions) to determine if there is a significant association between two variables (e.g., relationship between gender and diabetes status).
- It is unsuitable for comparing numerical values like mean weight measurements, which are continuous data.
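The paired t-test works on the within-subject differences rather than on the two sets of readings separately. The sketch below shows that calculation under stated assumptions: the function name and the before/after weights are hypothetical, and the p-value lookup against the t distribution is omitted.

```python
import math
import statistics


def paired_t(before, after):
    """t statistic and degrees of freedom for paired (before/after) measurements."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # Each subject contributes one difference; the test is run on these differences.
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se_diff = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se_diff, n - 1


# Hypothetical weights (kg) of the same five patients before and after the intervention.
before = [82, 90, 77, 85, 95]
after = [80, 86, 76, 82, 91]

t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.2f}, df = {df}")
```

Using the differences removes between-patient variability, which is why the paired test is preferred whenever the same subjects are measured twice.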
Explanation:

***Mean, median - Dispersion***
- This statement is **incorrect** because the **mean** and **median** are measures of **central tendency** (location) of a distribution, not dispersion.
- Measures of dispersion quantify the spread of data, such as **standard deviation**, range, and interquartile range.

***Standard error - Variation***
- **Standard error** is a measure of the **variation** (or dispersion) of sample means around the true population mean, making this a correct match.
- Specifically, it estimates how much the sample mean is likely to deviate from the population mean.

***Correlation coefficient - Relationship***
- The **correlation coefficient** (e.g., Pearson's r) measures the **strength and direction of the linear relationship** between two variables, making this a correct match.
- Its value ranges from -1 (perfect negative relationship) to +1 (perfect positive relationship).

***Moments - Skewness***
- **Moments** are specific mathematical calculations used to describe the shape and characteristics of a distribution; the **third moment** is specifically used to calculate **skewness**.
- **Skewness** describes the asymmetry of the distribution (whether it leans left or right), and the third moment helps quantify this.
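Two of the parameters above, the standard error and moment-based skewness, can be illustrated with a short sketch. The function names and the data are hypothetical. The skewness shown is the common moment coefficient (third central moment divided by the second central moment raised to the power 1.5), which is one of several conventions.

```python
import math
import statistics


def standard_error(data):
    """Standard error of the mean: sample SD divided by the square root of n."""
    return statistics.stdev(data) / math.sqrt(len(data))


def moment_skewness(data):
    """Skewness from central moments: m3 / m2 ** 1.5."""
    n = len(data)
    mean = statistics.mean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]       # hypothetical symmetric data
right_skewed = [1, 2, 3, 4, 15]   # hypothetical data with a long right tail

print(standard_error(symmetric))
print(moment_skewness(symmetric), moment_skewness(right_skewed))
```

The symmetric data give a skewness of zero, while the long right tail in the second dataset produces a positive value.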