In statistics, centiles and quartiles are considered as:
In a study done in a hospital, patients were categorized into three groups based on disease prevalence (Low, Medium, High), and individuals were then randomly selected from each group. What type of sampling is this?
A study is conducted to compare the mean hemoglobin (Hb) levels between two independent groups. Which statistical test is most appropriate?
The given image shows:

The given image shows:

Which type of statistical graph is shown in the image below?

The following scatter plot shows the correlation between weight and height in 4 different samples. Each of the 4 samples has a correlation coefficient of 0.6. When all 4 samples are taken together, what will be the net correlation coefficient?

The following diagram shows:

The person in the upper part of the image is an IV drug abuser who is undergoing rehabilitation. The people shown are either his sexual contacts or persons with whom he had shared a needle. Further information from those contacts leads to multiple such drug addicts in the community who had not sought medical services until now. This is an example of:

This is a map of an urban slum in New Delhi. The medical officer of the PHC notified the higher authorities after observing multiple cases of measles in the 1-2 years age group. The authorities conducted a survey of vaccine coverage in the area in the manner shown below. Which sampling method is depicted here?

Explanation:

***Measures of location/position***
- Centiles (or **percentiles**) and **quartiles** are statistics that divide the data distribution into equal parts, indicating where a particular value stands relative to the rest of the data.
- They are also known as **quantiles**, used to describe the location of specific data points within the distribution rather than summarizing the center or spread.

*Measures of central tendency*
- These statistics aim to describe the typical or **central value** of a dataset (e.g., **Mean**, **Median**, **Mode**).
- While the median is technically the second quartile (**Q2**) and the 50th centile, the classification of centiles and quartiles collectively is broader (measures of position).

*Measures of dispersion*
- These measures quantify the **spread** or **variability** of the data around the central value (e.g., **Standard Deviation**, **Variance**, Range).
- Although quartiles are essential for calculating the **Interquartile Range (IQR)**, which is a measure of dispersion, the quartiles themselves define points of position.

*Measures of correlation*
- Correlation measures describe the **linear relationship** or association between **two variables** (e.g., Correlation Coefficient, R-value).
- They are used in bivariate analysis and have no role in describing the position or central value of a single dataset.
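
The relationship between quartiles, centiles, the median, and the IQR can be checked numerically. The sketch below uses Python's standard `statistics.quantiles`; the readings are invented for illustration, and the `inclusive` interpolation method is only one of several conventions, so other software may report slightly different cut points.

```python
import statistics

# Hypothetical readings, invented purely for illustration.
values = [10, 11, 12, 13, 14, 15, 16, 17, 18]

# n=4 returns the three cut points Q1, Q2, Q3 (quartiles).
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

# n=100 returns the 99 centile cut points; index 49 is the 50th centile.
centiles = statistics.quantiles(values, n=100, method="inclusive")
p50 = centiles[49]

# The quartiles are positions; the IQR derived from them is a dispersion measure.
iqr = q3 - q1

print("Q1, Q2, Q3:", q1, q2, q3)
print("50th centile:", p50, "median:", statistics.median(values))
print("IQR:", iqr)
```

Here Q2, the 50th centile, and the median coincide, which is why the median sits in both the "central tendency" and the "position" families.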
Explanation:

***Correct: Stratified random sampling***
- This method involves dividing the population into non-overlapping subgroups (**strata**) based on a characteristic (here, disease prevalence: Low, Medium, High).
- Subsequently, a **simple random sample** is drawn from *each* stratum independently to ensure representation from all groups.
- This ensures that each subgroup is adequately represented in the final sample, making it ideal when the population has distinct subgroups.

*Incorrect: Simple random sampling*
- Every individual in the entire population has an equal and independent chance of being selected.
- It does not involve dividing the population into specific subgroups or categories before selection.
- This method may underrepresent or overrepresent certain subgroups by chance.

*Incorrect: Systematic random sampling*
- This involves selecting every *k*th element after a random start point, where *k* is the sampling interval (Population Size/Sample Size).
- Like simple random sampling, it does not involve creating predefined strata based on characteristics like disease prevalence.
- It is a simpler alternative to simple random sampling but does not ensure representation of specific subgroups.

*Incorrect: Cluster random sampling*
- The population is divided into natural groupings (**clusters**), such as geographical areas or schools.
- Unlike stratification, entire clusters are randomly selected, and *all* individuals within the selected clusters (or a random sample thereof) are included in the study.
- This differs from stratified sampling, where we sample from ALL strata; in cluster sampling, we sample only SOME clusters.
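
The two steps of stratified random sampling (group by stratum, then draw a simple random sample within each stratum) can be sketched as follows. The function name `stratified_sample`, the patient list, and the equal allocation of three per stratum are all assumptions made for illustration; real designs often use proportional allocation instead.

```python
import random

def stratified_sample(population, stratum_of, per_stratum, seed=0):
    """Draw a simple random sample of size per_stratum from every stratum."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    strata = {}
    for unit in population:
        # Step 1: assign each unit to exactly one non-overlapping stratum.
        strata.setdefault(stratum_of(unit), []).append(unit)
    # Step 2: sample independently within each stratum.
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical patients: (id, prevalence group). Invented data.
groups = ("Low", "Medium", "High")
patients = [(i, groups[i % 3]) for i in range(30)]

sample = stratified_sample(patients, stratum_of=lambda p: p[1], per_stratum=3)
for name, members in sample.items():
    print(name, members)
```

Every stratum appears in the result, which is the defining contrast with cluster sampling, where only some groups are selected.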
Explanation:

***Unpaired t-test***
- This test is specifically used to compare the **means** of a continuous outcome variable (like hemoglobin level) between **two independent, unrelated groups**.
- It is based on the assumption that the data is normally distributed and variances are equal (though a modification exists for unequal variances, known as Welch's t-test).

*Paired t-test*
- The paired t-test is used when the data comes from **dependent** or **related groups**, such as measuring the same individuals before and after an intervention (pre-post study).
- Since the question specifies two **independent** groups, this test is incorrect.

*Chi-square test*
- This test is used to analyze the association or difference between **two or more categorical variables** (e.g., comparing proportions or frequencies in nominal data).
- It is unsuitable for comparing the **mean** of a **continuous variable** like Hb levels.

*ANOVA*
- Analysis of Variance (ANOVA) is used to compare the **means** of a continuous variable among **three or more independent groups**.
- Since the study involves only **two groups**, the unpaired t-test is the simpler and more conventional choice, although ANOVA yields the same result when reduced to two groups.
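
As a rough illustration of how the unpaired t statistic is formed, the sketch below computes both the pooled-variance (Student) and the Welch versions with the standard library only. The Hb values and the function names are invented for illustration. Turning the statistic into a p-value needs the t distribution, which the standard library does not provide; SciPy's `scipy.stats.ttest_ind` (with `equal_var=True` or `False`) is one option for that step.

```python
import math
import statistics

def pooled_t(a, b):
    """Student's unpaired t statistic, assuming equal variances."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2

def welch_t(a, b):
    """Welch's t statistic, which does not assume equal variances."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    se2 = va / na + vb / nb
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical Hb values (g/dL) for two independent groups. Invented data.
group_a = [12.1, 13.4, 11.8, 12.9, 13.0]
group_b = [10.9, 11.5, 12.0, 10.4, 11.2]

t_student, df_student = pooled_t(group_a, group_b)
t_welch, df_welch = welch_t(group_a, group_b)
print("Student:", t_student, "df =", df_student)
print("Welch:  ", t_welch, "df =", df_welch)
```

With equal group sizes the two statistics coincide and only the degrees of freedom differ.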
Explanation:

***Funnel plot***
- This graph displays individual study results (trial size vs. effect size), forming a **funnel-like shape** when no publication bias is present.
- The observed asymmetry in the plot, where many small trials with a larger effect size are on one side, suggests potential **publication bias**, indicating this is a funnel plot.

*Kaplan Meier plot*
- A Kaplan-Meier plot is used to estimate the **survival function** from lifetime data.
- It displays the probability of an event (e.g., death, disease remission) over time, characterized by a **stepped curve**.

*Spaghetti plot*
- A spaghetti plot is used to visualize longitudinal data, showing **multiple individual trajectories** over time.
- Each line in a spaghetti plot represents data from a **single subject** or entity across different time points.

*Forest plot*
- A forest plot graphically presents the results of individual studies included in a **meta-analysis** and the overall pooled estimate.
- It typically shows **effect size** and **confidence intervals** for each study, often represented by squares and horizontal lines.
Explanation:

***Funnel plot***
- A **funnel plot** displays the **effect sizes** from individual studies against a measure of their precision (e.g., sample size or standard error), typically in the context of a meta-analysis.
- The characteristic **funnel shape** (or inverted triangle) arises because smaller studies (lower precision) have wider confidence intervals and thus show more variability, while larger studies (higher precision) cluster more closely around the true effect size.

*Kaplan Meier plot*
- A **Kaplan-Meier plot** is used to estimate and display **survival probabilities** over time, particularly in clinical trials or observational studies.
- It shows a stepwise-decreasing curve as events (e.g., death, disease recurrence) occur, which is not what is depicted in the image.

*Spaghetti plot*
- A **spaghetti plot** displays the **individual trajectories** of multiple subjects over time, often used to visualize longitudinal data.
- Each line in a spaghetti plot represents a single subject's path, which is distinct from the scattered points representing individual studies in the given image.

*Forest plot*
- A **forest plot** is a graphical display used in **meta-analyses** to present the results of individual studies and their combined effect.
- It typically shows the effect estimate and confidence interval for each study as horizontal lines, with a diamond representing the overall pooled estimate, which is different from the funnel shape seen here.
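
Why small studies scatter more widely than large ones can be seen in a small simulation. This is only a sketch under simplifying assumptions: every study estimates the same true effect, the standard error is taken as `sd / sqrt(sample_size)`, and the function name, effect size, and study counts are invented for illustration.

```python
import random
import statistics

def simulate_effects(n_studies, sample_size, true_effect=0.5, sd=1.0, seed=0):
    """Simulate one effect estimate per study; precision grows with sample size."""
    rng = random.Random(seed)
    se = sd / sample_size ** 0.5  # standard error shrinks as studies get larger
    return [rng.gauss(true_effect, se) for _ in range(n_studies)]

small = simulate_effects(200, sample_size=20, seed=1)    # wide base of the funnel
large = simulate_effects(200, sample_size=2000, seed=2)  # narrow tip of the funnel

spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)
print("spread of small studies:", round(spread_small, 3))
print("spread of large studies:", round(spread_large, 3))
```

Plotting effect estimate against sample size for such data gives the symmetric funnel; dropping the small studies with unfavorable estimates would produce the asymmetry associated with publication bias.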
Explanation:

***Frequency polygon***
- A frequency polygon is constructed by plotting a **point at the midpoint of each class interval** at a height corresponding to its frequency, and then **connecting these points with straight lines**.
- The image clearly shows points plotted and connected by straight lines, representing the frequency distribution, which is characteristic of a frequency polygon.

*Histogram*
- A histogram uses **contiguous bars to represent the frequency distribution** of continuous data, where the width of the bar represents the class interval and the height represents the frequency.
- The image does not display bars, but rather a line graph connecting points, which differentiates it from a histogram.

*Sector diagram*
- A sector diagram, also known as a **pie chart**, divides a circle into sectors that represent proportions of a whole.
- The image is a two-dimensional graph with x and y axes, not a circular representation of proportions.

*Scatter diagram*
- A scatter diagram displays **individual data points** for two variables to show their relationship, without connecting them with lines in a continuous manner like a frequency polygon.
- While the image shows points, they are connected to form a shape, indicating a distribution over intervals rather than individual data points of two distinct variables.
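
The construction of a frequency polygon (class midpoints paired with class frequencies) can be sketched as below. The helper name `frequency_polygon_points`, the age values, and the class limits are invented for illustration; joining the returned points with straight lines in any plotting tool gives the polygon.

```python
def frequency_polygon_points(values, start, width, n_classes):
    """Return (class midpoint, frequency) pairs for equal-width class intervals."""
    counts = [0] * n_classes
    for v in values:
        idx = int((v - start) // width)  # which class interval v falls into
        if 0 <= idx < n_classes:
            counts[idx] += 1
    # Each vertex of the polygon sits at the midpoint of its class interval.
    return [(start + (i + 0.5) * width, c) for i, c in enumerate(counts)]

# Hypothetical ages, grouped into the classes 20-30, 30-40, 40-50, 50-60.
ages = [21, 23, 25, 27, 31, 33, 34, 38, 41, 45, 47, 52]
points = frequency_polygon_points(ages, start=20, width=10, n_classes=4)
print(points)
```

A histogram of the same data would draw a bar of height equal to each frequency across the full class width instead of a single point at the midpoint.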
Explanation:

***More than 0.6***
- When multiple distinct samples (each with r = 0.6) are combined, the **overall correlation coefficient typically increases** beyond the individual correlations, provided the group centers themselves lie along the same positive trend, as in this plot.
- This occurs because the **between-group variation** adds to the total variation when groups form separate clusters along the regression line.
- The combined dataset captures both the within-group correlation (0.6) and the systematic separation of groups, resulting in a stronger overall linear relationship.
- This is a well-recognized phenomenon in biostatistics: **pooled correlation > individual correlations when groups are separated along a common trend**.

*Less than 0.6*
- This would occur only if the groups showed contradictory trends or if combining them obscured the linear relationship (for example, if the groups were separated along one axis only).
- Not applicable here, since all groups show the same positive correlation and align along a common trend.

*Equal to 0.6*
- This is incorrect because **the correlation coefficient of pooled data ≠ mean of individual correlations**.
- The pooled correlation is calculated from all combined data points, which includes both within-group and between-group variation.
- When distinct clusters are aligned along the trend, pooling increases the correlation coefficient.

*Cannot be calculated*
- The correlation coefficient **can be calculated** by pooling all data points and applying the standard Pearson correlation formula.
- Sufficient information exists to compute the combined correlation from the aggregated dataset.
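
The effect of pooling can be checked on a small constructed dataset. In this sketch each of four groups has a within-group Pearson r of exactly 0.6; the ten-point base pattern, the shift of 4 units per group, and the function names are assumptions chosen only to keep the arithmetic simple. A second arrangement, with groups shifted along one axis only, is included to show that the increase depends on the groups lining up along the trend.

```python
import math

def pearson_r(points):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)

# Base pattern with r = 0.6: eight concordant points and two discordant ones.
base = [(1, 1)] * 4 + [(-1, -1)] * 4 + [(1, -1), (-1, 1)]

def shifted(dx, dy):
    return [(x + dx, y + dy) for x, y in base]

# Four groups whose centers lie along the same positive trend.
aligned_groups = [shifted(4 * k, 4 * k) for k in range(4)]
# Four groups separated along the x axis only.
offset_groups = [shifted(4 * k, 0) for k in range(4)]

within = [pearson_r(g) for g in aligned_groups]
pooled_aligned = pearson_r([p for g in aligned_groups for p in g])
pooled_offset = pearson_r([p for g in offset_groups for p in g])

print("within-group r:", within)
print("pooled r, groups along the trend:", pooled_aligned)
print("pooled r, groups shifted in x only:", pooled_offset)
```

The pooled value exceeds 0.6 in the aligned arrangement and falls below it in the offset one, so the answer to the question rests on what the scatter plot shows about where the groups sit.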
Explanation:

***Disease Triangle***
- This diagram illustrates the **Disease Triangle**, a model representing the interaction of three factors essential for disease to occur: a **host**, a **pathogen (agent)**, and a **stressful environment**.
- The shaded central area, labeled "Disease occurs," signifies that disease manifests only when all three components overlap and contribute simultaneously.
- This concept is fundamental in understanding **infectious disease epidemiology** and emphasizes that removing any one factor can prevent disease occurrence.

*Incorrect: Epidemiological Triad*
- While conceptually similar to the Disease Triangle, the term "Epidemiological Triad" typically refers to the same three components but without the specific diagrammatic representation shown here.

*Incorrect: Web of Causation*
- The Web of Causation is a more complex epidemiological model that shows **multiple interrelated factors** contributing to disease, not just three distinct components.
- It represents a network of factors rather than three overlapping circles.

*Incorrect: Venn Diagram*
- While this diagram uses the Venn diagram format (overlapping circles), it specifically represents the **Disease Triangle model** in epidemiology, not just a general Venn diagram.
Explanation:

***Snowball sampling***
- This method involves **identifying initial subjects** (e.g., the IV drug abuser in rehabilitation) and then asking them to identify others within their network who fit the study criteria.
- It is particularly useful for reaching **hidden or hard-to-reach populations**, such as IV drug abusers or individuals involved in stigmatized behaviors, as seen in this scenario where subsequent contacts lead to more unreached individuals.

*Cluster sampling*
- This method involves dividing the population into **clusters** (e.g., geographic areas) and then randomly sampling entire clusters.
- It is not applicable here as the sampling is based on personal connections, not pre-defined groups or locations.

*Stratified cluster sampling*
- This is a combination of stratified and cluster sampling, where the population is first divided into **strata**, and then clusters are randomly sampled from each stratum.
- This method is more complex and typically used when there is a need to ensure representation from specific subgroups within clusters, which is not the primary technique described.

*Convenience sampling*
- This method involves selecting participants who are **readily available** or easiest to access.
- While the initial contact might seem convenient, the extended chain of referrals to find more individuals goes beyond mere convenience and represents a deliberate effort to leverage existing social networks.
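
The referral chain in snowball sampling behaves like a breadth-first walk over a contact network, starting from the index subject. The sketch below assumes a hypothetical referral map and a cap on the number of referral waves; the names `snowball_sample`, `contacts`, and `max_waves` are invented for illustration.

```python
from collections import deque

def snowball_sample(contacts, seeds, max_waves):
    """Recruit seeds, then the people they name, wave by wave."""
    recruited = list(seeds)
    seen = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        person, wave = frontier.popleft()
        if wave == max_waves:
            continue  # stop asking for referrals after the last wave
        for contact in contacts.get(person, []):
            if contact not in seen:  # each person is recruited only once
                seen.add(contact)
                recruited.append(contact)
                frontier.append((contact, wave + 1))
    return recruited

# Hypothetical referral network: who names whom as a contact.
contacts = {
    "index": ["A", "B"],
    "A": ["C"],
    "B": ["C", "D"],
    "D": ["E"],
}

two_waves = snowball_sample(contacts, seeds=["index"], max_waves=2)
three_waves = snowball_sample(contacts, seeds=["index"], max_waves=3)
print(two_waves)
print(three_waves)
```

No sampling frame is needed up front, which is what makes the approach workable for hidden populations, at the cost of a non-random sample.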
Explanation:

***Cluster sampling***
- The image shows **groups (clusters)** of houses (red houses within red circles) being selected, and then all units within those selected groups are included in the sample.
- This method is typically used when the population is naturally divided into groups, such as geographical areas or blocks, making it **cost-effective** and practical, especially in large, dispersed populations like an urban slum.

*Simple random sampling*
- This method would involve **randomly selecting individual houses** from the entire slum without any pre-defined grouping, which is not depicted in the image.
- Each house would have an **equal chance of being selected**, and sampling would not be restricted to specific clusters.

*Systematic random sampling*
- This method involves selecting houses at a **fixed interval** (e.g., every 5th house) from a sorted list or along a defined path after a random starting point.
- The image does not show a systematic selection process or an underlying order for sampling the houses.

*Stratified random sampling*
- This method involves **dividing the population into homogeneous subgroups** (strata) based on a characteristic (e.g., age, income level) and then drawing a random sample from each stratum.
- While the map shows 'sections', these are not necessarily strata based on a relevant characteristic, and the sampling is not shown to be proportional or disproportional across these sections.
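
One-stage cluster sampling can be sketched as: pick some clusters at random, then survey every unit inside the chosen clusters. The block names, household labels, and the function name `cluster_sample` below are invented for illustration and are not taken from the map in the question.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly choose whole clusters and include all of their units."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Every household in a chosen cluster is surveyed; other clusters are skipped.
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical slum blocks, each listing its households.
blocks = {
    "Block A": ["A1", "A2", "A3"],
    "Block B": ["B1", "B2"],
    "Block C": ["C1", "C2", "C3", "C4"],
    "Block D": ["D1", "D2"],
}

survey = cluster_sample(blocks, n_clusters=2)
for name, households in survey.items():
    print(name, households)
```

Only some blocks appear in the result, but each appears in full, which is the reverse of stratified sampling, where every group appears but only in part.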