Which of the following statements about Simple Random Sampling is true?
Which of the following graphical methods is best for studying the decline in the percentage of alcohol usage in men and women over several years?
Which of the following is NOT a probability sampling method?
Which of the following statements is true regarding Positive Predictive Value (PPV)?
Which variation occurs due to different diseases being treated in different hospitals?
If the mean, median, and mode are 10, 18, and 26 respectively, what kind of distribution is it?
How can the power of a study be increased?
What is a measure of dispersion?
Standard error of the mean indicates what?
Child survival index is calculated by?
### Explanation

**Simple Random Sampling (SRS)** is the most basic form of probability sampling. Its fundamental principle is that **every individual element in the population has an equal and independent chance** of being selected for the study. This minimizes selection bias, so the sample tends to be representative of the population from which it is drawn.

#### Analysis of Options:

* **Option A (Correct):** This is the defining characteristic of SRS. With methods such as a lottery, a random number table, or a computer-generated random sequence, each unit in the sampling frame has a probability of $1/N$ (where $N$ is the population size) of being chosen on a single draw.
* **Option B (Incorrect):** Sampling based on similar characteristics refers to **Stratified Random Sampling**, where the population is divided into homogeneous subgroups (strata) before sampling.
* **Option C (Incorrect):** SRS is best suited for **small, homogeneous populations**. For large, heterogeneous populations, Stratified or Cluster sampling is more efficient and practical.
* **Option D (Incorrect):** A major prerequisite for SRS is a **complete and up-to-date sampling frame** (a list of all individuals in the population). Without this list, random selection is impossible.

#### NEET-PG High-Yield Pearls:

* **Gold Standard:** SRS is the theoretical "gold standard" for representativeness, but it is often difficult to implement in large-scale field studies.
* **Randomization vs. Random Sampling:** *Random Sampling* supports external validity (generalizability), while *Randomization* (Random Allocation) supports internal validity by balancing confounders between groups.
* **Methods of SRS:** Lottery method, Tippett’s random number table, and computer-generated sequences.
* **Sampling Error:** In SRS, the sampling error can be calculated easily using the standard error formula.

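
The equal-chance principle described above can be illustrated with a minimal sketch. The sampling frame, the function name `simple_random_sample`, and the seed are hypothetical choices made for illustration; the only assumption is that a complete list of units is available.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; each unit is equally likely to be picked."""
    rng = random.Random(seed)      # seeded only so the illustration is repeatable
    return rng.sample(frame, n)    # requires a complete sampling frame (a list)

# Hypothetical sampling frame of N = 100 units.
frame = [f"unit_{i}" for i in range(1, 101)]
sample = simple_random_sample(frame, n=10, seed=42)
print(sample)
```

Note that the function cannot run without the full `frame`, which mirrors the point made under Option D.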
### Explanation

**1. Why the Correct Answer is Right:**

The **Line Diagram** (or Line Graph) is the most suitable method for representing **trends over time**. In biostatistics, when we need to observe the progression, decline, or fluctuation of a variable (like alcohol usage percentage) across several years, a line diagram connects the data points to show the direction of change. It is particularly useful for comparing two or more groups (men vs. women) on the same axes, allowing a clear visual comparison of their respective trends.

**2. Why Other Options are Incorrect:**

* **Pie Chart:** This is used to show the **proportional distribution** of a single variable at a specific point in time (e.g., the share of different types of substances used). It cannot represent changes over a time series.
* **Histogram:** This is used for **continuous quantitative data** to show frequency distribution within a single group. It consists of adjacent rectangles where the area represents the frequency. It is not designed to show trends over years.
* **Frequency Polygon:** This is a derivative of the histogram, created by joining the midpoints of the tops of the histogram bars. While it shows distribution, it is used for frequency data, not for tracking a percentage trend over a chronological period.

**3. Clinical Pearls & High-Yield Facts for NEET-PG:**

* **Trend over time:** Always choose **Line Diagram**.
* **Correlation between two variables:** Always choose **Scatter Diagram**.
* **Comparison of discrete/qualitative data:** Use a **Bar Chart**.
* **Frequency distribution of continuous data:** Use a **Histogram**.
* **To find Median graphically:** Use an **Ogive** (Cumulative frequency curve).
* **Pictogram:** Uses images to represent data; easiest for a layperson to understand but least accurate.

### Explanation

In biostatistics, sampling methods are broadly categorized into **Probability** (Random) and **Non-probability** (Non-random) sampling.

**Why Quota Sampling is the Correct Answer:**

Quota sampling is a **non-probability sampling method**. In this technique, the population is divided into strata (e.g., age, gender), and the researcher is assigned a specific "quota" to fill from each group. However, unlike stratified random sampling, the selection of individuals within these quotas is done through **convenience or judgment** rather than random selection. Because members of the population do not each have a known, non-zero chance of being selected, it is not a probability method.

**Analysis of Incorrect Options:**

* **A. Simple Random Sampling:** The "gold standard" of probability sampling where every individual has an equal and independent chance of being selected (e.g., using a lottery method or random number table).
* **B. Systematic Random Sampling:** A probability method where the first unit is selected randomly, and subsequent units are chosen at fixed intervals (every $k^{th}$ unit). It is often used in OPD settings.
* **C. Cluster Sampling:** A probability method where the population is divided into "clusters" (e.g., villages or wards), and entire clusters are randomly selected. This is the method used by the WHO for Expanded Programme on Immunization (EPI) coverage surveys (30 × 7 cluster technique).

**High-Yield Clinical Pearls for NEET-PG:**

* **Non-probability methods** include: Quota, Convenience (Accidental), Purposive (Judgmental), and Snowball sampling.
* **Snowball sampling** is the method of choice for "hidden populations" (e.g., IV drug users, commercial sex workers).
* **Multistage sampling** is the most commonly used method in large-scale national health surveys in India (like NFHS).
* **Sampling Error** can be estimated only with probability sampling; non-probability sampling is prone to **Selection Bias** whose size cannot be quantified.

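
Systematic random sampling, as described under Option B above, can be sketched as follows. This is an illustrative sketch only: the function name `systematic_sample`, the patient list, and the interval `k = 10` are hypothetical.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Pick a random start among the first k units, then take every k-th unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # the only random step: start index in 0..k-1
    return frame[start::k]     # fixed sampling interval of k

# Hypothetical OPD register of 100 patients, sampling interval k = 10.
patients = list(range(1, 101))
chosen = systematic_sample(patients, k=10, seed=7)
print(chosen)
```

Because the start is random, each unit still has a known chance ($1/k$) of selection, which is why the method counts as probability sampling.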
### Explanation

**1. Why Option A is Correct:**

Positive Predictive Value (PPV) is the probability that a person who tests positive actually has the disease. Mathematically, it is calculated as:

$$PPV = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}$$

PPV **rises with prevalence**. As the prevalence of a disease in a population increases, true positives make up a larger share of all positive results and false positives a smaller share, so the ratio above increases. In simpler terms, a positive test result is much more likely to be a "true" case in a high-risk population than in a low-risk one.

**2. Why Other Options are Incorrect:**

* **Option B & C:** These are mathematically incorrect. PPV and Negative Predictive Value (NPV) are the two test measures that are **dependent** on the prevalence of the disease.
* **Option D:** While PPV changes with prevalence, it does not follow a simple "doubling" rule. The relationship is non-linear and depends on the fixed sensitivity and specificity of the test.

**3. NEET-PG High-Yield Clinical Pearls:**

* **Prevalence vs. Predictive Values:**
  * ↑ Prevalence = ↑ PPV and ↓ NPV.
  * ↓ Prevalence = ↓ PPV and ↑ NPV.
* **Sensitivity & Specificity:** Unlike predictive values, Sensitivity and Specificity are **inherent properties** of a diagnostic test and do not change with disease prevalence.
* **Screening Strategy:** To maximize PPV in clinical practice, screening should be targeted at **high-risk populations** (where prevalence is higher) rather than the general population.
* **Formula for PPV (Bayes' Theorem context):**

$$PPV = \frac{\text{Sensitivity} \times \text{Prevalence}}{(\text{Sensitivity} \times \text{Prevalence}) + (1 - \text{Specificity}) \times (1 - \text{Prevalence})}$$

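
The Bayes' theorem formula above can be checked numerically with a short sketch. The sensitivity, specificity, and prevalence values below are hypothetical and chosen only to show the direction of change; the NPV function is the analogous formula for negative results.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value: TP / (TP + FP), expressed as proportions."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def npv(sensitivity, specificity, prevalence):
    """Negative predictive value: TN / (TN + FN), expressed as proportions."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

# Hypothetical test with 90% sensitivity and 90% specificity.
for prev in (0.01, 0.10, 0.30):
    print(f"prevalence={prev:.2f}  PPV={ppv(0.9, 0.9, prev):.3f}  NPV={npv(0.9, 0.9, prev):.3f}")
```

With sensitivity and specificity held fixed, the printed PPV climbs and the NPV falls as prevalence goes up, and the change is not linear.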
**Explanation:**

**Berksonian Bias (Admission Rate Bias)** occurs when a study is conducted using hospital-based populations rather than the general community. It arises because patients with multiple diseases (comorbidities) are more likely to be admitted to a hospital than those with only one. If different hospitals specialize in different diseases, the association between an exposure and a disease may be artificially distorted (either strengthened or weakened) because the "hospitalized" sample does not represent the true distribution in the general population.

**Analysis of Incorrect Options:**

* **Neyman Bias (Prevalence-Incidence Bias):** This occurs when there is a gap between the onset of a disease and the selection of study subjects. It typically excludes patients who die early or recover quickly, leading to a sample of "survivors" (prevalent cases) rather than all incident cases.
* **Attention Bias (Hawthorne Effect):** This is a change in behavior by the study participants because they are aware they are being observed or studied.
* **Recall Bias:** Common in case-control studies, this occurs when cases remember past exposures more accurately or differently than healthy controls.

**Clinical Pearls for NEET-PG:**

* **Berksonian Bias** is a type of **Selection Bias**.
* To minimize this bias, researchers should ideally use **community-based samples** rather than hospital-based samples.
* **Key Trigger Words:** "Hospital-based study," "Different admission rates," or "Multiple comorbidities" usually point toward Berksonian bias.
* **Neyman Bias** is most commonly associated with **Cross-sectional studies** involving chronic diseases.

### Explanation

**1. Why the Correct Answer is Right:**

In biostatistics, the relationship between the **Mean, Median, and Mode** determines the shape of a frequency distribution curve.

* In this question: **Mean (10) < Median (18) < Mode (26).**
* When the mean is pulled toward the lower values (the left side), it indicates a **Negatively Skewed Distribution** (also known as "Left-skewed").
* The "tail" of the graph points toward the smaller numbers (negative side of the X-axis). This happens when there are a few extremely low values that drag the mean down, while the majority of data points are clustered at the higher end.

**2. Why the Incorrect Options are Wrong:**

* **Symmetric / Normal Distribution (Options A & B):** In a perfectly symmetrical or Normal (Gaussian) distribution, the **Mean = Median = Mode**. Since the three values here (10, 18, 26) all differ, the distribution is asymmetrical.
* **Positively Skewed (Option C):** In a positively skewed distribution (Right-skewed), the relationship is reversed: **Mean > Median > Mode**. The tail points toward the higher values (positive side), usually due to a few extremely high outliers.

**3. NEET-PG Clinical Pearls & High-Yield Facts:**

* **The "Alphabetical Rule":** In a **Positively Skewed** distribution, the three measures fall in dictionary order from largest to smallest: **Mean > Median > Mode** (compare the third letters: mea, med, mod). Reverse the order for a negatively skewed distribution.
* **Median's Position:** In a typical skewed distribution (positive or negative), the **Median lies between** the Mean and the Mode.
* **Best Measure of Central Tendency:**
  * For **Normal Distribution**: Mean is the best measure.
  * For **Skewed Distribution**: Median is the best measure (as it is not affected by extreme outliers).
* **Formula (Karl Pearson’s empirical relation):** $\text{Mode} \approx (3 \times \text{Median}) - (2 \times \text{Mean})$. It is an approximation for moderately skewed distributions, often used to estimate a missing value in NEET-PG numericals; the values in this question (10, 18, 26) do not satisfy it exactly.

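
The ordering rule above can be written as a small classifier, shown here as an illustrative sketch. The function names are hypothetical, and the second function applies Karl Pearson's empirical relation, which is only an approximation.

```python
def skew_direction(mean, median, mode):
    """Classify skew from the ordering of the three measures of central tendency."""
    if mean < median < mode:
        return "negatively skewed (left tail)"
    if mean > median > mode:
        return "positively skewed (right tail)"
    if mean == median == mode:
        return "symmetric"
    return "indeterminate from ordering alone"

def pearson_mode_estimate(mean, median):
    """Empirical estimate: Mode is approximately 3 * Median - 2 * Mean."""
    return 3 * median - 2 * mean

print(skew_direction(10, 18, 26))      # values from the question
print(pearson_mode_estimate(10, 18))   # empirical estimate, not the stated mode of 26
```

For the question's values the classifier reports a negative skew, while the empirical estimate of the mode (34) differs from the stated mode (26), which shows that the relation is approximate.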
### Explanation

In biostatistics, the **Power of a Study** is defined as the probability that the study will correctly reject a null hypothesis when it is false (i.e., the ability to detect a true difference or effect).

**1. Why the Correct Answer is Right:**

Mathematically, **Power = 1 − β**, where β is the probability of a Beta (Type II) error.

* **Beta (Type II) error** occurs when a researcher fails to reject a null hypothesis that is actually false (a "false negative").
* Since Power and Beta error are inversely related, **decreasing the Beta error** directly increases the Power of the study. A study with a Beta error of 0.20 (20%) has a Power of 0.80 (80%).

**2. Why the Incorrect Options are Wrong:**

* **Option A & C (Alpha Error):** Alpha (Type I) error is the probability of rejecting a null hypothesis when it is actually true (a "false positive"). While decreasing alpha makes a study more stringent, it *increases* the risk of a Type II error when everything else is held constant, thereby *decreasing* power.
* **Option D (Increasing Beta Error):** Increasing the Beta error means the study is more likely to miss a true effect, which mathematically reduces the Power.

**3. NEET-PG High-Yield Clinical Pearls:**

* **Sample Size:** The most common practical way to increase the power of a study in clinical research is to **increase the sample size**.
* **Standard Power:** In most medical research, a power of **80% (0.8)** is considered the minimum acceptable level.
* **Determinants of Power:** Power is influenced by the sample size, the effect size (magnitude of difference), the significance level (alpha), and the variance (standard deviation) in the data.
* **Type I Error (α):** "Finding a difference when none exists."
* **Type II Error (β):** "Missing a difference that actually exists."

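
The determinants of power listed above can be explored with a minimal sketch. It assumes a two-sided one-sample z-test with a known standard deviation, which is a simplification chosen for illustration; the function name `z_test_power` and all numeric inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sd, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (known sd) to detect a true shift `effect`."""
    z = NormalDist()                      # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)     # critical value for the chosen alpha
    shift = effect * sqrt(n) / sd         # true effect in standard-error units
    # Probability that the test statistic falls in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

for n in (16, 32, 64):
    power = z_test_power(effect=0.5, sd=1.0, n=n)
    print(f"n={n:3d}  power={power:.3f}  beta={1 - power:.3f}")
```

In this sketch, raising `n` increases power (and lowers β), while lowering `alpha` with everything else fixed decreases power.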
**Explanation:**

In biostatistics, data is summarized using two primary types of measures: **Measures of Central Tendency** (averages) and **Measures of Dispersion** (variability).

**Why Standard Deviation is Correct:**

**Standard Deviation (SD)** is the most commonly used measure of dispersion in medical research. It quantifies the extent of variation or "spread" of data points around the arithmetic mean. A low SD indicates that the data points tend to be close to the mean, while a high SD indicates that the data is spread out over a wider range of values. It is the square root of the variance and is expressed in the same units as the original data.

**Why the Other Options are Incorrect:**

* **Mean (A):** This is the arithmetic average of all observations. It is a measure of central tendency, not dispersion.
* **Mode (B):** This is the value that occurs most frequently in a dataset. It is a measure of central tendency used primarily for nominal data.
* **Median (D):** This is the middle-most value when data is arranged in ascending or descending order. It is a measure of central tendency used especially when data is skewed.

**High-Yield Facts for NEET-PG:**

* **Measures of Dispersion include:** Range, Mean Deviation, Standard Deviation, and Coefficient of Variation.
* **Relative Measure of Dispersion:** The **Coefficient of Variation** is used to compare the variability of two different series (e.g., comparing height in cm vs. weight in kg).
* **Normal Distribution (Gaussian Curve):** In a normal distribution, Mean = Median = Mode.
* **The 68-95-99.7 Rule:** In a normal distribution, Mean ± 1 SD covers about 68% of values, Mean ± 2 SD covers about 95%, and Mean ± 3 SD covers about 99.7%.

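
The measures of dispersion and the 68-95-99.7 rule above can be reproduced with a short sketch. The height values are hypothetical, `stdev` here is the sample standard deviation (divisor $n - 1$), and the helper names are illustrative.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical heights in cm.
heights = [160, 165, 170, 175, 180]

sd = stdev(heights)                 # sample SD, same units as the data
cv = sd / mean(heights) * 100       # coefficient of variation, in percent (unit-free)
print(f"SD = {sd:.2f} cm, CV = {cv:.2f}%")

def normal_coverage(k):
    """Proportion of a normal distribution lying within mean +/- k SD."""
    z = NormalDist()
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"mean +/- {k} SD covers {normal_coverage(k) * 100:.1f}%")
```

Because the coefficient of variation is unit-free, it allows series measured in different units to be compared.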
**Explanation:**

**Standard Error of the Mean (SEM)** is a measure of the **deviation** of the sample mean from the true population mean. While Standard Deviation (SD) measures the spread of individual observations within a single sample, SEM measures the precision of the sample mean as an estimate of the population mean.

1. **Why "Deviation" is correct:** SEM is mathematically defined as the standard deviation of the sampling distribution of the mean. It quantifies how much the sample mean is likely to "deviate" from the actual population mean. A smaller SEM indicates that the sample mean is a more precise estimate of the population mean.
2. **Why other options are incorrect:**
   * **Dispersion & Variation:** These are broad terms describing the spread of data. While SEM is a type of dispersion, these terms usually refer to **Standard Deviation (SD)** or **Variance**, which describe the spread of individual data points around their own mean, rather than the reliability of the mean itself.
   * **Distribution:** This refers to the overall pattern or shape of the data (e.g., Normal/Gaussian distribution) rather than a specific numerical measure of error or precision.

**High-Yield NEET-PG Pearls:**

* **Formula:** $SEM = \frac{SD}{\sqrt{n}}$ (where $n$ is the sample size).
* **Relationship:** As the sample size ($n$) increases, the SEM decreases, making the estimate more precise.
* **Clinical Application:** SEM is primarily used to calculate **Confidence Intervals (CI)**.
* **Key Distinction:** Use **SD** to describe the variability of a biological characteristic (e.g., BP in a group); use **SEM** to describe the uncertainty of the mean estimate in a study.

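
The SEM formula and its use in a confidence interval can be sketched as follows. The SD and sample sizes are hypothetical, and the interval uses the normal-approximation multiplier 1.96, which is an assumption that suits large samples; small samples would normally call for a t-distribution multiplier instead.

```python
from math import sqrt

def sem_from_sd(sd, n):
    """Standard error of the mean: SD divided by the square root of the sample size."""
    return sd / sqrt(n)

def ci95(sample_mean, sd, n):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    margin = 1.96 * sem_from_sd(sd, n)
    return (sample_mean - margin, sample_mean + margin)

# Hypothetical systolic BP data: mean 120 mmHg, SD 10 mmHg.
for n in (25, 100, 400):
    low, high = ci95(120, 10, n)
    print(f"n={n:3d}  SEM={sem_from_sd(10, n):.2f}  95% CI=({low:.2f}, {high:.2f})")
```

Quadrupling the sample size halves the SEM, so the confidence interval narrows as `n` grows while the SD itself stays the same.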
### Explanation

**1. Why the Correct Answer is Right:**

The **Child Survival Index (CSI)** is a health indicator used to represent the probability of a child surviving until their fifth birthday. It is derived from the **Under-5 Mortality Rate (U5MR)**, which is the number of deaths per 1,000 live births before age five. The formula is:

$$\text{Child Survival Index} = \frac{1000 - \text{U5MR}}{10}$$

* **Logic:** Subtracting U5MR from 1,000 gives the number of survivors out of 1,000 live births. Dividing by 10 converts this figure into a **percentage (%)**, making it a standardized index for comparing regional health performance.

**2. Why Incorrect Options are Wrong:**

* **Options A & B:** These use the **Infant Mortality Rate (IMR)**. While IMR (deaths before age 1) is a sensitive indicator of socio-economic development, the Child Survival Index specifically focuses on the "Under-5" milestone, which reflects broader factors like nutrition and immunization.
* **Option D:** This formula results in a negative number (e.g., $50 - 1000 = -950$), which cannot be a valid survival index.

**3. High-Yield Facts for NEET-PG:**

* **U5MR vs. IMR:** U5MR is considered the best single indicator of social development and well-being rather than just health status.
* **Child Survival Index:** It was a key metric used in the **Child Survival and Safe Motherhood (CSSM)** program launched in India (1992).
* **Indicator of Choice:** For monitoring the progress of Millennium Development Goals (MDGs) and now Sustainable Development Goals (SDGs), U5MR is the preferred indicator.
* **Current Trend:** As of recent NFHS data, India has seen a significant decline in U5MR, though regional disparities remain.

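
The formula above reduces to a one-line calculation, sketched here for illustration. The function name is hypothetical, and the U5MR of 50 per 1,000 live births is an example value, not a reported statistic.

```python
def child_survival_index(u5mr):
    """Child Survival Index (%) from the under-5 mortality rate per 1,000 live births."""
    return (1000 - u5mr) / 10

# Example value only: U5MR of 50 per 1,000 live births.
print(child_survival_index(50))  # 95.0, i.e. 95% of children survive to age five
```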