Biostatistics Practice Questions

Q: In statistics, centiles and quartiles are considered as:

Measures of location/position.

***Measures of location/position***
- Centiles (or **percentiles**) and **quartiles** are statistics that divide the ordered data distribution into equal parts, indicating where a particular value stands relative to the rest of the data.
- They are also known as **quantiles** and describe the location of specific data points within the distribution rather than summarizing its center or spread.

*Measures of central tendency*
- These statistics aim to describe the typical or **central value** of a dataset (e.g., **Mean**, **Median**, **Mode**).
- While the median is technically the second quartile (**Q2**) and the 50th centile, centiles and quartiles taken collectively belong to the broader class: measures of position.

*Measures of dispersion*
- These measures quantify the **spread** or **variability** of the data around the central value (e.g., **Standard Deviation**, **Variance**, Range).
- Although quartiles are essential for calculating the **Interquartile Range (IQR)**, which is a measure of dispersion, the quartiles themselves define points of position.

*Measures of correlation*
- Correlation measures describe the **linear relationship** or association between **two variables** (e.g., Correlation Coefficient, R-value).
- They are used in bivariate analysis and have no role in describing the position or central value of a single dataset.
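As a rough illustration of quartiles and centiles as cut points, the sketch below uses Python's standard `statistics.quantiles`. The hemoglobin values are invented for the example, and the `"inclusive"` method is just one of several quantile conventions; other conventions give slightly different cut points.

```python
import statistics

# Hypothetical hemoglobin values (g/dL), invented for illustration only.
values = [9.8, 10.4, 11.1, 11.9, 12.3, 12.8, 13.0, 13.6, 14.2, 15.1, 15.7]

# n=4 returns the three quartile cut points (Q1, Q2, Q3).
# method="inclusive" treats the data as the whole population.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

# n=100 returns the 99 centile cut points; index 49 is the 50th centile.
centiles = statistics.quantiles(values, n=100, method="inclusive")

median = statistics.median(values)  # a measure of central tendency
iqr = q3 - q1                       # a measure of dispersion built from quartiles

print(f"Q1={q1:.2f}  Q2={q2:.2f}  Q3={q3:.2f}  IQR={iqr:.2f}")
print(f"median={median:.2f}  50th centile={centiles[49]:.2f}")
```

The quartiles mark positions in the ordered data; the median coincides with Q2, and the IQR is derived from Q1 and Q3, which mirrors the distinction drawn in the explanation above.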

Q: In a study done in a hospital, patients were categorized into three groups based on disease prevalence (Low, Medium, High), and individuals were then randomly selected from each group. What type of sampling is this?

Stratified random sampling.

***Correct: Stratified random sampling***
- This method involves dividing the population into non-overlapping subgroups (**strata**) based on a characteristic (here, disease prevalence: Low, Medium, High).
- Subsequently, a **simple random sample** is drawn from *each* stratum independently to ensure representation from all groups.
- This ensures that each subgroup is adequately represented in the final sample, making it ideal when the population has distinct subgroups.

*Incorrect: Simple random sampling*
- Every individual in the entire population has an equal and independent chance of being selected.
- It does not involve dividing the population into specific subgroups or categories before selection.
- This method may underrepresent or overrepresent certain subgroups by chance.

*Incorrect: Systematic random sampling*
- This involves selecting every *k*th element after a random start point, where *k* is the sampling interval (Population Size/Sample Size).
- Like simple random sampling, it does not involve creating predefined strata based on characteristics like disease prevalence.
- It is an operationally simpler alternative to simple random sampling but does not ensure representation of specific subgroups.

*Incorrect: Cluster random sampling*
- The population is divided into natural groupings (**clusters**), such as geographical areas or schools.
- Unlike stratification, entire clusters are randomly selected, and *all* individuals within the selected clusters (or a random sample thereof) are included in the study.
- This differs from stratified sampling, where we sample from ALL strata; in cluster sampling, we sample only SOME clusters.
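The two-step procedure described above (group into strata, then draw a simple random sample within each stratum) can be sketched as follows. The function name `stratified_sample`, the patient list, and the fixed seed are all assumptions made for illustration, not part of the original question.

```python
import random

def stratified_sample(units, stratum_of, per_stratum, seed=0):
    """Draw a simple random sample of up to `per_stratum` units from every stratum."""
    rng = random.Random(seed)  # fixed seed only so the example is repeatable

    # Step 1: partition the population into non-overlapping strata.
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)

    # Step 2: sample independently within each stratum.
    sample = {}
    for name, members in strata.items():
        k = min(per_stratum, len(members))
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical patients: (patient id, prevalence group).
patients = [(i, ("Low", "Medium", "High")[i % 3]) for i in range(30)]
chosen = stratified_sample(patients, stratum_of=lambda p: p[1], per_stratum=4)

for group, members in chosen.items():
    print(group, [pid for pid, _ in members])
```

Every stratum contributes to the sample, which is the feature that separates this design from cluster sampling, where only some groups are selected.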

Q: A study is conducted to compare the mean hemoglobin (Hb) levels between two independent groups. Which statistical test is most appropriate?

Unpaired t-test.

***Unpaired t-test***
- This test is specifically used to compare the **means** of a continuous outcome variable (like hemoglobin level) between **two independent, unrelated groups**.
- The classical (Student's) form assumes that the data in each group are approximately normally distributed and that the two variances are equal; when the variances are unequal, the modified version known as Welch's t-test is used.

*Paired t-test*
- The paired t-test is used when the data come from **dependent** or **related groups**, such as measuring the same individuals before and after an intervention (pre-post study).
- Since the question specifies two **independent** groups, this test is incorrect.

*Chi-square test*
- This test is used to analyze the association or difference between **two or more categorical variables** (e.g., comparing proportions or frequencies in nominal data).
- It is unsuitable for comparing the **mean** of a **continuous variable** like Hb levels.

*ANOVA*
- Analysis of Variance (ANOVA) is used to compare the **means** of a continuous variable among **three or more independent groups**.
- Since the study involves only **two groups**, the unpaired t-test is the simpler and more conventional choice, although one-way ANOVA on two groups gives an equivalent result (F = t²).
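To make the test statistic concrete, here is a minimal sketch that computes the unpaired t statistic and its degrees of freedom by hand, in both the equal-variance (Student) and unequal-variance (Welch) forms. The function name `unpaired_t` and the Hb values are hypothetical; no p-value is computed because the standard library has no t distribution.

```python
import math
import statistics

def unpaired_t(a, b, equal_var=True):
    """Return (t statistic, degrees of freedom) for two independent samples."""
    na, nb = len(a), len(b)
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)  # sample variances

    if equal_var:
        # Student's t-test: pool the two variances.
        df = na + nb - 2
        pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / df
        se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    else:
        # Welch's t-test: separate variances, Welch-Satterthwaite df.
        se = math.sqrt(var_a / na + var_b / nb)
        df = (var_a / na + var_b / nb) ** 2 / (
            (var_a / na) ** 2 / (na - 1) + (var_b / nb) ** 2 / (nb - 1)
        )
    return (mean_a - mean_b) / se, df

# Hypothetical Hb levels (g/dL) in two independent groups.
group_a = [12.1, 13.4, 11.8, 12.9, 13.0]
group_b = [10.9, 11.5, 10.2, 11.8, 11.1]

t_student, df_student = unpaired_t(group_a, group_b, equal_var=True)
t_welch, df_welch = unpaired_t(group_a, group_b, equal_var=False)
print(f"Student: t={t_student:.3f}, df={df_student}")
print(f"Welch:   t={t_welch:.3f}, df={df_welch:.2f}")
```

In practice a statistics package would normally be used instead; for example, SciPy's `scipy.stats.ttest_ind` takes an `equal_var` argument to switch between the two forms and also returns the p-value.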

Q: The given image shows:

Funnel plot.

***Funnel plot***
- This graph displays individual study results (trial size vs. effect size), forming a symmetric **funnel-like shape** when no publication bias is present.
- The asymmetry observed in this plot, where many small trials with a larger effect size lie on one side, suggests potential **publication bias**; assessing such asymmetry is the characteristic use of a funnel plot.

*Kaplan Meier plot*
- A Kaplan-Meier plot is used to estimate the **survival function** from lifetime data.
- It displays the probability of an event (e.g., death, disease remission) over time, characterized by a **stepped curve**.

*Spaghetti plot*
- A spaghetti plot is used to visualize longitudinal data, showing **multiple individual trajectories** over time.
- Each line in a spaghetti plot represents data from a **single subject** or entity across different time points.

*Forest plot*
- A forest plot graphically presents the results of individual studies included in a **meta-analysis** and the overall pooled estimate.
- It typically shows **effect size** and **confidence intervals** for each study, often represented by squares and horizontal lines.

Q: The given image shows:

Funnel plot.

***Funnel plot***
- A **funnel plot** displays the **effect sizes** from individual studies against a measure of their precision (e.g., sample size or standard error), typically in the context of a meta-analysis.
- The characteristic **funnel shape** (or inverted triangle) arises because smaller studies (lower precision) have wider confidence intervals and thus show more variability, while larger studies (higher precision) cluster more closely around the true effect size.

*Kaplan Meier plot*
- A **Kaplan-Meier plot** is used to estimate and display **survival probabilities** over time, particularly in clinical trials or observational studies.
- It shows a stepwise-decreasing curve as events (e.g., death, disease recurrence) occur, which is not what is depicted in the image.

*Spaghetti plot*
- A **spaghetti plot** displays the **individual trajectories** of multiple subjects over time, often used to visualize longitudinal data.
- Each line in a spaghetti plot represents a single subject's path, which is distinct from the scattered points representing individual studies in the given image.

*Forest plot*
- A **forest plot** is a graphical display used in **meta-analyses** to present the results of individual studies and their combined effect.
- It typically shows the effect estimate and confidence interval for each study as horizontal lines, with a diamond representing the overall pooled estimate, which is different from the funnel shape seen here.
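The narrowing of the funnel with increasing precision can be illustrated numerically. The sketch below assumes, purely for illustration, that every study estimates a mean with the same standard deviation, so the standard error is `sd / sqrt(n)`; the helper `funnel_limits`, the pooled effect, and the study sizes are all hypothetical.

```python
import math

def funnel_limits(pooled_effect, standard_error, z=1.96):
    """Approximate 95% limits around the pooled effect at a given precision."""
    half_width = z * standard_error
    return pooled_effect - half_width, pooled_effect + half_width

sd = 2.0             # assumed common standard deviation (hypothetical)
pooled_effect = 0.5  # assumed pooled estimate (hypothetical)

widths = []
for n in (20, 80, 320, 1280):
    se = sd / math.sqrt(n)  # larger studies have smaller standard errors
    low, high = funnel_limits(pooled_effect, se)
    widths.append(high - low)
    print(f"n={n:5d}  SE={se:.3f}  expected range=({low:.2f}, {high:.2f})")
```

Each fourfold increase in study size halves the standard error, so the band within which unbiased studies are expected to fall tightens toward the pooled estimate, producing the funnel outline.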

Q: Which type of statistical graph is shown in the image below?

Frequency polygon.

***Frequency polygon***
- A frequency polygon is constructed by plotting a **point at the midpoint of each class interval** at a height corresponding to its frequency, and then **connecting these points with straight lines**.
- The image shows points plotted and connected by straight lines, representing the frequency distribution, which is characteristic of a frequency polygon.

*Histogram*
- A histogram uses **contiguous bars to represent the frequency distribution** of continuous data, where the width of the bar represents the class interval and the height represents the frequency.
- The image does not display bars, but rather a line graph connecting points, which differentiates it from a histogram.

*Sector diagram*
- A sector diagram, also known as a **pie chart**, divides a circle into sectors that represent proportions of a whole.
- The image is a two-dimensional graph with x and y axes, not a circular representation of proportions.

*Scatter diagram*
- A scatter diagram displays **individual data points** for two variables to show their relationship, without connecting them with lines as a frequency polygon does.
- While the image shows points, they are connected to form a shape, indicating a distribution over class intervals rather than individual observations of two distinct variables.
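The construction rule for a frequency polygon (class midpoint on the x-axis, class frequency on the y-axis) can be sketched as below. The function name, the ages, and the class boundaries are invented for illustration; the returned points are what a plotting library would then join with straight lines.

```python
def frequency_polygon_points(values, start, width, n_classes):
    """Return (class midpoint, frequency) pairs for equal-width class intervals."""
    counts = [0] * n_classes
    for v in values:
        idx = int((v - start) // width)  # class interval this value falls into
        if 0 <= idx < n_classes:
            counts[idx] += 1
    # Each point sits at the midpoint of its class interval.
    return [(start + (i + 0.5) * width, counts[i]) for i in range(n_classes)]

# Hypothetical ages grouped into the classes 20-29, 30-39, 40-49.
ages = [21, 23, 25, 28, 31, 33, 34, 36, 38, 41, 45, 47]
points = frequency_polygon_points(ages, start=20, width=10, n_classes=3)
print(points)  # [(25.0, 4), (35.0, 5), (45.0, 3)]
```

A histogram would draw bars over the same class intervals; the polygon keeps only the midpoint of each bar top and connects them.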

Q: The following scatter plot of 4 different samples shows the correlation between weight and height in the samples. All 4 samples have the same coefficient of correlation of 0.6. Taken together, what will be the net correlation coefficient?

More than 0.6.

***More than 0.6***
- When distinct samples (each with r = 0.6) are pooled and their clusters lie along a common upward trend, as in this scatter plot, the **overall correlation coefficient is higher** than the within-sample value.
- This occurs because the **between-group variation** adds to the within-group variation: the group means of height and weight rise together, which contributes additional positive covariance.
- The combined dataset captures both the within-group correlation (0.6) and the systematic separation of groups, resulting in a stronger overall linear relationship.
- In short: **pooled correlation > individual correlations when the groups are separated along the same trend**.

*Less than 0.6*
- This would occur only if the groups showed contradictory trends or if combining them obscured the linear relationship (for example, if the group means fell in one variable as they rose in the other).
- Not applicable here since all groups show the same positive correlation and align along a common trend.

*Equal to 0.6*
- This is incorrect because **the correlation coefficient of pooled data ≠ mean of individual correlations**.
- The pooled correlation is calculated from all combined data points, which includes both within-group and between-group variation.
- The pooled value would stay at 0.6 only in special cases, for example if the clusters had identical means and spread; with distinct clusters spread along the trend, pooling increases the correlation coefficient.

*Cannot be calculated*
- The correlation coefficient **can be calculated** by pooling all data points and applying the standard Pearson correlation formula.
- The question asks only for the direction of change, which can be judged from how the clusters are arranged in the plot.
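A small numeric check may help here. The sketch below builds four groups from one tiny made-up sample whose Pearson correlation is exactly 0.6, then pools them under three arrangements. The data, the shift sizes, and the helper names `pearson_r` and `pooled_r` are assumptions for illustration and do not come from the question's image.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient from the standard sum-of-products formula."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# One made-up sample with r = 0.6; every group is a shifted copy of it.
base_x = [0, 1, 2, 3, 4]
base_y = [1, 2, 0, 4, 3]

def pooled_r(shift_x, shift_y, groups=4):
    """Pool `groups` copies of the base sample, shifting copy k by k * (shift_x, shift_y)."""
    xs, ys = [], []
    for k in range(groups):
        xs += [x + k * shift_x for x in base_x]
        ys += [y + k * shift_y for y in base_y]
    return pearson_r(xs, ys)

within = pearson_r(base_x, base_y)  # correlation inside each group
aligned = pooled_r(10, 10)          # group means rise together
opposed = pooled_r(10, -10)         # group means move in opposite directions
overlapping = pooled_r(0, 0)        # groups sit on top of each other

print(f"within={within:.3f} aligned={aligned:.3f} "
      f"opposed={opposed:.3f} overlapping={overlapping:.3f}")
```

Only the aligned arrangement pushes the pooled coefficient above 0.6, so the answer rests on what the scatter plot shows: clusters separated along a common positive trend.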

Q: The following diagram shows:

Disease Triangle.

***Disease Triangle***
- This diagram illustrates the **Disease Triangle**, a model representing the interaction of three factors essential for disease to occur: a **host**, a **pathogen (agent)**, and a **stressful environment**.
- The shaded central area, labeled "Disease occurs," signifies that disease manifests only when all three components overlap and contribute simultaneously.
- This concept is fundamental in understanding **infectious disease epidemiology** and emphasizes that removing any one factor can prevent disease occurrence.

*Incorrect: Epidemiological Triad*
- While conceptually similar to the Disease Triangle, the term "Epidemiological Triad" typically refers to the same three components but without the specific triangular diagrammatic representation shown here.

*Incorrect: Web of Causation*
- The Web of Causation is a more complex epidemiological model that shows **multiple interrelated factors** contributing to disease, not just three distinct components.
- It represents a network of factors rather than three overlapping circles.

*Incorrect: Venn Diagram*
- While this diagram uses the Venn diagram format (overlapping circles), it specifically represents the **Disease Triangle model** in epidemiology, not just a general Venn diagram.

Q: The person in the upper part of the image is an IV drug abuser who is undergoing rehabilitation. The people shown are either sexual contacts or persons with whom he had shared a needle. Further information from those contacts leads to multiple such drug addicts in the community who had not sought medical services until now. This is an example of:

Snowball sampling.

***Snowball sampling***
- This method involves **identifying initial subjects** (e.g., the IV drug abuser in rehabilitation) and then asking them to identify others within their network who fit the study criteria.
- It is particularly useful for reaching **hidden or hard-to-reach populations**, such as IV drug abusers or individuals involved in stigmatized behaviors, as seen in this scenario where subsequent contacts lead to more unreached individuals.

*Cluster sampling*
- This method involves dividing the population into **clusters** (e.g., geographic areas) and then randomly sampling entire clusters.
- It is not applicable here as the sampling is based on personal connections, not pre-defined groups or locations.

*Stratified cluster sampling*
- This is a combination of stratified and cluster sampling, where the population is first divided into **strata**, and then clusters are randomly sampled from each stratum.
- This method is more complex and typically used when there is a need to ensure representation from specific subgroups within clusters, which is not the technique described.

*Convenience sampling*
- This method involves selecting participants who are **readily available** or easiest to access.
- While the initial contact might seem convenient, the extended chain of referrals to find more individuals goes beyond mere convenience and represents a deliberate effort to leverage existing social networks.
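The referral chain in snowball sampling behaves like a wave-by-wave walk over a contact network. The sketch below models it as a breadth-first traversal; the function `snowball_sample`, the referral dictionary, and the wave limit are hypothetical and only illustrate the recruitment logic.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_waves):
    """Recruit seeds, then their referrals, wave by wave, up to `max_waves` waves."""
    recruited = list(seeds)
    seen = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)

    while frontier:
        person, wave = frontier.popleft()
        if wave == max_waves:
            continue  # stop asking for referrals once the last wave is reached
        for contact in referrals.get(person, []):
            if contact not in seen:  # each person is recruited only once
                seen.add(contact)
                recruited.append(contact)
                frontier.append((contact, wave + 1))
    return recruited

# Hypothetical referral network: who names whom.
referrals = {
    "index": ["A", "B"],
    "A": ["C"],
    "B": ["C", "D"],
    "D": ["E"],
}

two_waves = snowball_sample(referrals, seeds=["index"], max_waves=2)
three_waves = snowball_sample(referrals, seeds=["index"], max_waves=3)
print(two_waves)    # ['index', 'A', 'B', 'C', 'D']
print(three_waves)  # ['index', 'A', 'B', 'C', 'D', 'E']
```

Because recruitment depends on who knows whom rather than on random selection, the result is a non-probability sample, which is why the method suits hidden populations but limits generalizability.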

Q: This is a map of an urban slum in New Delhi. The medical officer of the PHC notified the higher authorities on observing multiple cases of measles in the 1-2 years age group. The authorities conducted a survey of vaccine coverage in the area in the manner shown below. Which sampling is depicted here?

Cluster sampling.

***Cluster sampling***
- The image shows **groups (clusters)** of houses (red houses within red circles) being selected, and then all units within those selected groups are included in the sample.
- This method is typically used when the population is naturally divided into groups, such as geographical areas or blocks, making it **cost-effective** and practical, especially in large, dispersed populations like an urban slum.

*Simple random sampling*
- This method would involve **randomly selecting individual houses** from the entire slum without any pre-defined grouping, which is not depicted in the image.
- Each house would have an **equal chance of being selected**, and sampling would not be restricted to specific clusters.

*Systematic random sampling*
- This involves selecting houses at a **fixed interval** (e.g., every 5th house) from a sorted list or along a defined path after a random starting point.
- The image does not show a systematic selection process or an underlying order for sampling the houses.

*Stratified random sampling*
- This method involves **dividing the population into homogeneous subgroups** (strata) based on a characteristic (e.g., age, income level) and then drawing a random sample from each stratum.
- While the map shows 'sections', these are not necessarily strata based on a relevant characteristic, and the sampling is not shown to be proportional or disproportional across these sections.
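One-stage cluster sampling, as described above, selects whole groups at random and then includes every unit inside them. A minimal sketch follows; the block and house names, the function `cluster_sample`, and the fixed seed are illustrative assumptions rather than details of the survey in the question.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly pick `n_clusters` whole clusters and include every unit in them."""
    rng = random.Random(seed)  # fixed seed only so the example is repeatable
    chosen = rng.sample(sorted(clusters), n_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical slum map: six blocks of five houses each.
blocks = {
    f"block_{b}": [f"block_{b}_house_{h}" for h in range(5)]
    for b in "ABCDEF"
}

chosen_blocks, houses = cluster_sample(blocks, n_clusters=2)
print(chosen_blocks)
print(len(houses), "houses surveyed")
```

Only the selected blocks are visited, which keeps fieldwork compact; in stratified sampling, by contrast, every group would contribute some houses.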

Question 1

In statistics, centiles and quartiles are considered as:

Accepted Answer

Measures of location/position

Answer

Measures of dispersion

Answer

Measures of central tendency

Answer

Measures of correlation

Question 2

In a study done in a hospital, patients were categorized into three groups based on disease prevalence (Low, Medium, High), and individuals were then randomly selected from each group. What type of sampling is this?

Accepted Answer

Stratified random sampling

Answer

Systematic random sampling

Answer

Cluster random sampling

Answer

Simple random sampling

Question 3

A study is conducted to compare the mean hemoglobin (Hb) levels between two independent groups. Which statistical test is most appropriate?

Accepted Answer

Unpaired t-test

Answer

Paired t-test

Answer

Chi-square test

Answer

ANOVA

Question 4

The given image shows:

Accepted Answer

Funnel plot

Answer

Kaplan Meier plot

Answer

Spaghetti plot

Answer

Forest plot

Question 5

The given image shows:

Accepted Answer

Funnel plot

Answer

Kaplan Meier plot

Answer

Spaghetti plot

Answer

Forest plot

Question 6

Which type of statistical graph is shown in the image below?

Accepted Answer

Frequency polygon

Answer

Histogram

Answer

Sector diagram

Answer

Scatter diagram

Question 7

The following scatter plot of 4 different samples shows the correlation between weight and height in the samples. All 4 samples have the same coefficient of correlation of 0.6. Taken together, what will be the net correlation coefficient?

Accepted Answer

More than 0.6

Answer

Less than 0.6

Answer

Equal to 0.6

Answer

Cannot be calculated

Question 8

The following diagram shows:

Accepted Answer

Disease Triangle

Answer

Epidemiological Triad

Answer

Web of Causation

Answer

Venn Diagram

Question 9

The person in the upper part of the image is an IV drug abuser who is undergoing rehabilitation. The people shown are either sexual contacts or persons with whom he had shared a needle. Further information from those contacts leads to multiple such drug addicts in the community who had not sought medical services until now. This is an example of:

Accepted Answer

Snowball sampling

Answer

Cluster sampling

Answer

Stratified cluster sampling

Answer

Convenience sampling

Question 10

This is a map of an urban slum in New Delhi. The medical officer of the PHC notified the higher authorities on observing multiple cases of measles in the 1-2 years age group. The authorities conducted a survey of vaccine coverage in the area in the manner shown below. Which sampling is depicted here?

Accepted Answer

Cluster sampling

Answer

Simple random sampling

Answer

Systematic random sampling

Answer

Stratified random sampling

Biostatistics — MCQs
