Biostatistics Practice Questions

Q: In a standard normal distribution curve, what percentage of the area under the curve lies between the mean and one standard deviation from the mean?

A: 34%

***34%***
- In a **standard normal distribution**, approximately 34.1% of the data falls between the **mean** and one **standard deviation** above the mean, and similarly, 34.1% falls between the mean and one standard deviation below the mean.
- This is a fundamental property derived from the **empirical rule (68-95-99.7 rule)**, where 68% of the data lies within one standard deviation of the mean (34% on each side).

*15%*
- This percentage is too low and does not match the area between the mean and one standard deviation in a **standard normal distribution**.
- About 15.87% of the data falls *beyond* one standard deviation in each tail (above +1 SD, or below -1 SD), but that is not the area *between* the mean and one standard deviation.

*68%*
- This value represents the total area under the curve that lies within **one standard deviation** *of the mean* (i.e., from -1 SD to +1 SD).
- It is the sum of the areas between the mean and +1 SD, and between the mean and -1 SD, which is 34% + 34% = 68%. The question specifically asks for the area between the mean and *one* standard deviation (i.e., on one side).

*95%*
- This value represents the total area under the curve that lies within **two standard deviations** *of the mean* (i.e., from -2 SD to +2 SD).
- According to the **empirical rule**, approximately 95% of data falls within two standard deviations of the mean.
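The areas quoted above can be checked numerically. The following is a minimal sketch that builds the standard normal cumulative distribution from the error function in Python's standard library; the helper name `standard_normal_cdf` is chosen here for illustration only.

```python
import math


def standard_normal_cdf(z: float) -> float:
    """Return P(Z <= z) for a standard normal variable Z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Area between the mean (z = 0) and one SD above it (z = 1).
area_mean_to_one_sd = standard_normal_cdf(1.0) - standard_normal_cdf(0.0)

# Area in one tail, beyond one SD above the mean.
area_beyond_one_sd = 1.0 - standard_normal_cdf(1.0)

# Area within one SD on either side of the mean.
area_within_one_sd = standard_normal_cdf(1.0) - standard_normal_cdf(-1.0)

print(f"mean to +1 SD: {area_mean_to_one_sd:.4f}")  # 0.3413
print(f"beyond +1 SD:  {area_beyond_one_sd:.4f}")   # 0.1587
print(f"-1 SD to +1 SD: {area_within_one_sd:.4f}")  # 0.6827
```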

Q: What is the numerator in the formula for calculating Negative Predictive Value (NPV) in diagnostic testing?

A: True negative

***True negative***
- In the calculation of **Negative Predictive Value (NPV)**, the numerator represents the number of individuals who are truly disease-free and also test negative for the disease.
- NPV answers the question: "If a patient tests negative, what is the probability that they are actually **disease-free**?"

*True positive*
- **True positives** are individuals who have the disease and also test positive; they are the numerator for **Positive Predictive Value (PPV)**.
- They do not factor into the numerator for NPV, which focuses on negative test results and the absence of disease.

*False positive*
- **False positives** are individuals who do not have the disease but test positive; they are found in the denominator for PPV, but not in the numerator for NPV.
- They represent an incorrect test result and do not contribute to the count of truly healthy individuals with a negative test.

*False negative*
- **False negatives** are individuals who have the disease but test negative; they are in the denominator for **sensitivity** and NPV.
- They represent a missed diagnosis and are not part of the numerator for NPV, which counts only correctly identified healthy individuals.
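To make the numerators and denominators concrete, here is a small sketch that computes NPV alongside PPV and sensitivity from the four cells of a 2×2 diagnostic table. The counts are hypothetical and are not taken from any study.

```python
# Hypothetical 2x2 diagnostic table (illustrative counts only).
true_positive = 80    # diseased, test positive
false_positive = 30   # healthy, test positive
false_negative = 20   # diseased, test negative
true_negative = 170   # healthy, test negative

# NPV: of everyone who tested negative, the share who are truly disease-free.
npv = true_negative / (true_negative + false_negative)

# PPV: of everyone who tested positive, the share who truly have the disease.
ppv = true_positive / (true_positive + false_positive)

# Sensitivity: of everyone with the disease, the share who tested positive.
sensitivity = true_positive / (true_positive + false_negative)

print(f"NPV = {npv:.3f}")                  # 0.895
print(f"PPV = {ppv:.3f}")                  # 0.727
print(f"Sensitivity = {sensitivity:.3f}")  # 0.800
```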

Q: Which type of chart is best to represent the following data? Year: 1991, 1992, 1993, 1994; Number of LBW babies: 125, 50, 25, 75.

A: Bar chart

***Bar chart***
- A **bar chart** is the most appropriate for representing categorical data or discrete numerical data over a period.
- Each year (1991, 1992, 1993, 1994) represents a distinct category, and the number of LBW babies is the quantitative value associated with each year.

*Histogram*
- A **histogram** is used to represent the distribution of continuous numerical data, grouped into bins, to show frequencies.
- The data provided (years and counts) is discrete, not continuous.

*Frequency polygon*
- A **frequency polygon** is used to display the shape of the distribution of a continuous variable, often by connecting the midpoints of the tops of the bars in a histogram.
- It is not suitable for discrete yearly data, as there are no continuous intervals to connect.

*Scatter diagram*
- A **scatter diagram** is used to show the relationship or correlation between two continuous numerical variables.
- While one variable is numerical (number of LBW babies), the other (year) is categorical or ordinal, and the primary purpose here is to show change over time, not a correlation between two continuous variables.

Q: The variation in data is compared with another data set by:

A: Coefficient of variation

***Coefficient of variation***
- The **coefficient of variation (CV)** is a standardized measure of dispersion of a probability distribution or frequency distribution.
- It expresses the **standard deviation** as a percentage of the **mean**, making it useful for comparing the variability of two independent data sets with different units or widely different means.

*Standard Error of Mean*
- The **Standard Error of the Mean (SEM)** is used to estimate the variability between sample means if multiple samples were taken from the same population.
- It primarily quantifies how precisely a sample mean estimates a population mean; it is not meant for comparing variation between different data sets.

*Standard Deviation*
- **Standard deviation (SD)** measures the amount of variation or dispersion of a set of values *within a single data set*.
- While it quantifies variability, it is not ideal for directly comparing the variability of two data sets with different units or means because it isn't normalized.

*Variance*
- **Variance** measures how far each number in the set is from the mean; it is the **average of the squared differences** from the mean.
- Like standard deviation, variance describes the spread within a single dataset and is not normalized for direct comparison between datasets with different scales.
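A short sketch of why the CV is the better tool for comparison: the two hypothetical samples below (heights in cm, weights in kg) have the same standard deviation, yet their relative variability differs because their means differ. The data and the helper name `coefficient_of_variation` are illustrative assumptions.

```python
import statistics


def coefficient_of_variation(values: list[float]) -> float:
    """Return the sample SD expressed as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


# Hypothetical samples in different units.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]

# Both samples have the same SD, so SD alone cannot rank their variability.
print(f"SD heights: {statistics.stdev(heights_cm):.2f}")  # 15.81
print(f"SD weights: {statistics.stdev(weights_kg):.2f}")  # 15.81

# The CV is unit-free and shows weights vary more relative to their mean.
cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)
print(f"CV heights: {cv_heights:.1f}%")  # 9.3%
print(f"CV weights: {cv_weights:.1f}%")  # 22.6%
```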

Q: In a study assessing malnutrition among young children, 100 children were selected from rural and urban areas (50 from each area). Out of these, 30 children from rural areas and 20 children from urban areas were found to be malnourished. Which statistical test is appropriate for comparing the proportions of malnourished children between the two groups?

A: Chi-square

***Chi-square***
- The **chi-square test** is used to compare proportions or frequencies between two or more categorical groups. Here, we are comparing the proportion of malnourished children (a categorical outcome) between two different living areas (rural vs. urban, also categorical).
- This test determines if there is a statistically significant association between the two categorical variables.

*Paired t-test*
- A **paired t-test** is used to compare the means of two related groups or samples, such as measurements taken before and after an intervention on the same individuals.
- This scenario involves comparing independent groups (rural vs. urban children) and proportions, not means from paired samples.

*The standard error of mean*
- The **standard error of the mean (SEM)** is a measure of the statistical accuracy of an estimate; specifically, it's the standard deviation of the sample mean's distribution.
- It is used to quantify the variability of sample means, not to perform a comparative hypothesis test between two groups.

*ANOVA*
- **ANOVA (Analysis of Variance)** is used to compare the means of **three or more independent groups**. While it compares means, it is not appropriate for comparing proportions between just two groups.
- If we were comparing the mean weight of children across three or more living areas, ANOVA would be suitable, but not for comparing proportions between two groups.
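As an illustration, the counts in the question (30 of 50 rural and 20 of 50 urban children malnourished) can be put through a Pearson chi-square test by hand. This is a sketch using only the standard library, without Yates' continuity correction; the closed-form p-value via `math.erfc` is valid only for one degree of freedom, which is the case for a 2×2 table.

```python
import math

# Rows: rural, urban. Columns: malnourished, not malnourished.
observed = [
    [30, 20],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

# With 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2.0))

print(f"chi-square = {chi_square:.2f}")  # 4.00
print(f"p-value = {p_value:.4f}")        # 0.0455
```

Under these assumptions the statistic is 4.0 with a p-value of about 0.046, so the difference in proportions would be judged significant at the conventional 0.05 level; applying a continuity correction would give a more conservative result.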

Q: In a normal distribution with mean = 200 and standard deviation = 20, what is the range in which 68% of the values will fall?

A: 180-220

***180-220***
- In a **normal distribution**, approximately 68% of the data falls within **one standard deviation** of the mean.
- With a mean of 200 and a standard deviation of 20, this range is calculated as 200 ± 20, which equals **180-220**.

*160-240*
- This range represents the values falling within **two standard deviations** of the mean (200 ± 2 × 20 = 160-240).
- Approximately **95%** of the values in a normal distribution fall within this range, not 68%.

*170-230*
- This range corresponds to 1.5 standard deviations from the mean (200 ± 1.5 × 20 = 170-230), which is not an integer multiple of the standard deviation.
- It does not match any of the standard empirical-rule percentages (68%, 95%, or 99.7%).

*190-210*
- This range represents half of one standard deviation from the mean (200 ± 0.5 × 20 = 190-210).
- This range covers a smaller percentage of values than 68%, approximately **38%**.
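The four answer ranges can be reproduced with a few lines of arithmetic. The sketch below computes mean ± k·SD for each option and the share of a normal distribution it covers; the helper name `coverage_within` is an illustrative choice.

```python
import math

mean = 200.0
sd = 20.0


def coverage_within(k: float) -> float:
    """Return the fraction of a normal distribution within k SDs of the mean."""
    return math.erf(k / math.sqrt(2.0))


# Multiples of the SD matching the four answer options.
results = {}
for k in (0.5, 1.0, 1.5, 2.0):
    low, high = mean - k * sd, mean + k * sd
    results[k] = (low, high, coverage_within(k))
    print(f"{k} SD: {low:.0f}-{high:.0f} covers {coverage_within(k):.1%}")

# 0.5 SD: 190-210 covers 38.3%
# 1.0 SD: 180-220 covers 68.3%
# 1.5 SD: 170-230 covers 86.6%
# 2.0 SD: 160-240 covers 95.4%
```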

Q: In the context of statistical analysis, if you have the value of one variable, which coefficient would you use to predict the value of another variable?

A: Coefficient of regression

***Coefficient of regression***
- The **coefficient of regression** (or **regression coefficient**) is fundamental in **regression analysis**, which is specifically designed to predict the value of a **dependent variable** based on the value of one or more **independent variables**.
- It quantifies the expected change in the dependent variable for a unit change in the independent variable.

*Coefficient of variation*
- The **coefficient of variation** is a measure of **relative variability** or dispersion, expressing the standard deviation as a percentage of the mean.
- It describes the extent of variation in relation to the mean but does not provide a basis for predicting one variable from another.

*Coefficient of correlation*
- The **coefficient of correlation** measures the **strength and direction of a linear relationship** between two variables.
- While it indicates how well two variables move together, it does not directly enable the prediction of one variable's value from another; that is the role of regression.

*Coefficient of determination*
- The **coefficient of determination (R²)** represents the **proportion of the variance** in the dependent variable that can be explained by the independent variable(s) in a regression model.
- It quantifies how well the regression model fits the observed data, but it is not used directly for prediction; rather, it assesses the predictive power of the model.
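To show how a regression coefficient is used for prediction, here is a minimal least-squares sketch for one independent variable. The data points and the function name `fit_line` are hypothetical and serve only to illustrate the calculation.

```python
def fit_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical paired observations.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x_values, y_values)
print(f"regression coefficient (slope) = {slope:.1f}")  # 0.6
print(f"intercept = {intercept:.1f}")                   # 2.2

# Predict y for a new x using the fitted coefficients.
predicted = intercept + slope * 6
print(f"predicted y at x = 6: {predicted:.1f}")         # 5.8
```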

Q: Which of the following is the MOST important vital statistic in a population?

A: Mortality rate

***Mortality rate***
- The **mortality rate** directly reflects the health status and overall well-being of a population by indicating the number of deaths per unit population.
- A high mortality rate signals underlying public health issues, inadequate healthcare, or poor living conditions, making it the **most critical vital statistic** for assessing population health and guiding interventions.
- It serves as a **key indicator** for comparing health status across populations and time periods.

*Fertility rate*
- The **fertility rate** measures the average number of children born to women of reproductive age, influencing future population size and age structure.
- While important for demographic planning and population projections, it doesn't directly provide insights into the immediate health challenges or mortality burden of a population.

*Morbidity rate*
- The **morbidity rate** quantifies the incidence or prevalence of disease in a population, reflecting the disease burden.
- Although crucial for understanding health problems and planning healthcare services, it is considered secondary to mortality as a vital statistic since mortality represents the ultimate health outcome.

*Birth rate*
- The **birth rate** quantifies the number of live births per 1,000 people in a year, contributing to population growth and demographic trends.
- Like the fertility rate, it is essential for understanding natality patterns but offers less insight into the overall health status and survival of a population compared to the mortality rate.

Q: In a clinical study examining the relationship between weight and height in pediatric patients, what is the maximum possible value of the correlation coefficient if the correlation is very strong?

A: +1

***+1 (perfect positive correlation)***
- A correlation coefficient of **+1** indicates a perfect positive linear relationship between two variables: as one variable increases, the other increases along an exact straight line.
- This value represents the **maximum possible strength** for a positive correlation.

*0*
- A correlation coefficient of **0** indicates no linear relationship between two variables.
- This would contradict the premise that the correlation is "very strong".

*+2 (invalid value for correlation coefficient)*
- The correlation coefficient, also known as Pearson's r, can only range from **-1 to +1**.
- A value of +2 is outside this possible range and is therefore an **invalid value**.

*No correlation (not possible for strong correlation)*
- **No correlation** implies a correlation coefficient of 0 or close to 0.
- This directly contradicts the statement that there is a **very strong correlation** between weight and height.
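The bound on Pearson's r can be seen by computing it directly. The sketch below uses hypothetical data: one perfectly linear pair, which reaches the maximum of +1, and one noisy pair, which falls below it. The function name `pearson_r` is an illustrative choice.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Return Pearson's correlation coefficient for paired samples x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data.
x_values = [1, 2, 3, 4, 5]
perfect_y = [2 * v + 1 for v in x_values]  # exact straight line
noisy_y = [2, 4, 5, 4, 5]                  # positive but imperfect trend

r_perfect = pearson_r(x_values, perfect_y)
r_noisy = pearson_r(x_values, noisy_y)

print(f"perfect linear: r = {r_perfect:.3f}")  # 1.000
print(f"noisy trend:    r = {r_noisy:.3f}")    # 0.775
```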

Question 1

In a standard normal distribution curve, what percentage of the area under the curve lies between the mean and one standard deviation from the mean?

Accepted Answer

34%

Answer

68%

Answer

15%

Answer

95%

Question 2

What is the numerator in the formula for calculating Negative Predictive Value (NPV) in diagnostic testing?

Accepted Answer

True negative

Answer

True positive

Answer

False positive

Answer

False negative

Question 3

Which type of chart is best to represent the following data? Year: 1991, 1992, 1993, 1994; Number of LBW babies: 125, 50, 25, 75.

Accepted Answer

Bar chart

Answer

Histogram

Answer

Scatter diagram

Answer

Frequency polygon

Question 4

Which of the following best describes the paired t-test?

Accepted Answer

Test used to assess quantitative observations before and after an intervention.

Answer

Test used for categorical data.

Answer

Test applied to compare means of two independent groups.

Answer

None of the options.

Question 5

The variation in data is compared with another data set by:

Accepted Answer

Coefficient of variation

Answer

Standard Deviation

Answer

Standard Error of Mean

Answer

Variance

Question 6

In a study assessing malnutrition among young children, 100 children were selected from rural and urban areas (50 from each area). Out of these, 30 children from rural areas and 20 children from urban areas were found to be malnourished. Which statistical test is appropriate for comparing the proportions of malnourished children between the two groups?

Accepted Answer

Chi-square

Answer

Paired t-test

Answer

The standard error of mean

Answer

ANOVA

Question 7

In a normal distribution with mean = 200 and standard deviation = 20, what is the range in which 68% of the values will fall?

Accepted Answer

180-220

Answer

160-240

Answer

170-230

Answer

190-210

Question 8

In the context of statistical analysis, if you have the value of one variable, which coefficient would you use to predict the value of another variable?

Accepted Answer

Coefficient of regression

Answer

Coefficient of variation

Answer

Coefficient of correlation

Answer

Coefficient of determination

Question 9

Which of the following is the MOST important vital statistic in a population?

Accepted Answer

Mortality rate

Answer

Fertility rate

Answer

Morbidity rate

Answer

Birth rate

Question 10

In a clinical study examining the relationship between weight and height in pediatric patients, what is the maximum possible value of the correlation coefficient if the correlation is very strong?

Accepted Answer

+1

Answer

0

Answer

+2

Answer

No correlation

Biostatistics — MCQs
