In a standard normal distribution curve, what percentage of the area under the curve lies between the mean and one standard deviation from the mean?
What is the numerator in the formula for calculating Negative Predictive Value (NPV) in diagnostic testing?
Which type of chart is best to represent the following data? Year: 1991, 1992, 1993, 1994; Number of LBW babies: 125, 50, 25, 75.
In a normal distribution with mean = 200 and standard deviation = 20, what is the range in which 68% of the values will fall?
In the context of statistical analysis, if you have the value of one variable, which coefficient would you use to predict the value of another variable?
In a study assessing malnutrition among young children, 100 children were selected from rural and urban areas (50 from each area). Out of these, 30 children from rural areas and 20 children from urban areas were found to be malnourished. Which statistical test is appropriate for comparing the proportions of malnourished children between the two groups?
The variation in data is compared with another data set by:
Which of the following best describes the Paired T test?
Which of the following is the MOST important vital statistic in a population?
The population is divided into homogeneous subgroups, and then individuals are randomly selected from each subgroup. What type of sampling is this?
Explanation: ***34%***
- In a **standard normal distribution**, approximately 34.1% of the data falls between the **mean** and one **standard deviation** above the mean, and similarly, 34.1% falls between the mean and one standard deviation below the mean.
- This is a fundamental property derived from the **empirical rule (68-95-99.7 rule)**, where 68% of the data lies within one standard deviation of the mean (34% on each side).

*15%*
- This percentage is too low and does not match the area between the mean and one standard deviation in a **standard normal distribution**.
- About 15.87% of the data falls *beyond* one standard deviation on each side of the mean, but that is the tail area, not the area *between* the mean and one standard deviation.

*68%*
- This value represents the total area under the curve that lies within **one standard deviation** *of the mean* (i.e., from -1 SD to +1 SD).
- It is the sum of the areas between the mean and +1 SD, and between the mean and -1 SD, which is 34% + 34% = 68%. The question specifically asks for the area between the mean and *one* standard deviation (i.e., on one side).

*95%*
- This value represents the total area under the curve that lies within **two standard deviations** *of the mean* (i.e., from -2 SD to +2 SD).
- According to the **empirical rule**, approximately 95% of data falls within two standard deviations of the mean.

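
The areas quoted in this explanation can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from Python's standard library (Python 3.8+); the variable names are illustrative and not part of the original question.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist(mu=0.0, sigma=1.0)

# Area between the mean and +1 SD (one side only).
area_mean_to_1sd = z.cdf(1.0) - z.cdf(0.0)

# Area from -1 SD to +1 SD (both sides of the mean).
area_within_1sd = z.cdf(1.0) - z.cdf(-1.0)

# Area in the upper tail, beyond +1 SD.
area_beyond_1sd = 1.0 - z.cdf(1.0)

print(f"mean to +1 SD : {area_mean_to_1sd:.4f}")
print(f"-1 SD to +1 SD: {area_within_1sd:.4f}")
print(f"beyond +1 SD  : {area_beyond_1sd:.4f}")
```

The one-sided area is half of the two-sided area because the curve is symmetric about the mean.
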
Explanation: ***True negative***
- In the calculation of **Negative Predictive Value (NPV)**, the numerator represents the number of individuals who are truly disease-free and also test negative for the disease.
- NPV answers the question: "If a patient tests negative, what is the probability that they are actually **disease-free**?"

*True positive*
- **True positives** are individuals who have the disease and also test positive; they are the numerator for **Positive Predictive Value (PPV)**.
- They do not factor into the numerator for NPV, which focuses on negative test results and the absence of disease.

*False positive*
- **False positives** are individuals who do not have the disease but test positive; they are found in the denominator for PPV, but not in the numerator for NPV.
- They represent an incorrect test result and do not contribute to the count of truly healthy individuals with a negative test.

*False negative*
- **False negatives** are individuals who have the disease but test negative; they are in the denominator for **sensitivity** and NPV.
- They represent a missed diagnosis and are not part of the numerator for NPV, which counts only correctly identified healthy individuals.

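
As a small illustration of the formula NPV = TN / (TN + FN), here is a sketch in Python. The function name and the counts are hypothetical, chosen only to show where true negatives and false negatives enter the calculation.

```python
def negative_predictive_value(true_negatives: int, false_negatives: int) -> float:
    """Return NPV = TN / (TN + FN)."""
    total_negative_tests = true_negatives + false_negatives
    if total_negative_tests == 0:
        raise ValueError("NPV is undefined when nobody tests negative")
    return true_negatives / total_negative_tests


# Hypothetical counts: 100 people test negative, 90 of them are disease-free.
tn, fn = 90, 10
npv = negative_predictive_value(tn, fn)
print(f"NPV = {tn} / ({tn} + {fn}) = {npv:.2f}")
```

True negatives appear in both the numerator and the denominator, while false negatives appear only in the denominator.
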
Explanation: ***Bar chart***
- A **bar chart** is the most appropriate of these options for representing categorical data or discrete counts recorded over a period.
- Each year (1991, 1992, 1993, 1994) represents a distinct category, and the number of LBW babies is the quantitative value associated with each year.

*Histogram*
- A **histogram** is used to represent the distribution of continuous numerical data, grouped into bins, to show frequencies.
- The data provided (years and counts) are discrete, not continuous.

*Frequency polygon*
- A **frequency polygon** is used to display the shape of the distribution of a continuous variable, often by connecting the midpoints of the tops of the bars in a histogram.
- It is not suitable for discrete yearly data, as there are no continuous intervals to connect.

*Scatter diagram*
- A **scatter diagram** is used to show the relationship or correlation between two continuous numerical variables.
- While one variable is numerical (number of LBW babies), the other (year) is categorical or ordinal, and the primary purpose here is to show change over time, not a correlation between two continuous variables.

Explanation: ***180-220***
- In a **normal distribution**, approximately 68% of the data falls within **one standard deviation** of the mean.
- With a mean of 200 and a standard deviation of 20, this range is calculated as 200 ± 20, which equals **180-220**.

*160-240*
- This range represents the values falling within **two standard deviations** of the mean (200 ± 2 × 20 = 160-240).
- Approximately **95%** of the values in a normal distribution fall within this range, not 68%.

*170-230*
- This range corresponds to 1.5 standard deviations from the mean (200 ± 1.5 × 20 = 170-230), which is not a whole-number multiple of the standard deviation.
- It does not match any of the standard percentages in the empirical rule (68%, 95%, or 99.7%).

*190-210*
- This range represents half of one standard deviation from the mean (200 ± 0.5 × 20 = 190-210).
- It covers a smaller percentage of values than 68%, about **38%**.

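
The ranges in the answer options can be reproduced with a short calculation. This is a sketch using `statistics.NormalDist` (Python 3.8+); the helper names `k_sd_range` and `coverage` are made up for this example.

```python
from statistics import NormalDist

# Distribution from the question: mean 200, standard deviation 20.
dist = NormalDist(mu=200, sigma=20)


def k_sd_range(d: NormalDist, k: float) -> tuple:
    """Return (low, high) for the interval mean ± k standard deviations."""
    return (d.mean - k * d.stdev, d.mean + k * d.stdev)


def coverage(d: NormalDist, low: float, high: float) -> float:
    """Return the fraction of values expected between low and high."""
    return d.cdf(high) - d.cdf(low)


one_sd = k_sd_range(dist, 1)
for k in (0.5, 1, 1.5, 2):
    low, high = k_sd_range(dist, k)
    print(f"±{k} SD: {low:.0f}-{high:.0f} covers {coverage(dist, low, high):.1%}")
```

Only the ±1 SD interval (180-220) gives a coverage close to 68%.
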
Explanation: ***Coefficient of regression***
- The **coefficient of regression** (or **regression coefficient**) is fundamental in **regression analysis**, which is specifically designed to predict the value of a **dependent variable** based on the value of one or more **independent variables**.
- It quantifies the expected change in the dependent variable for a unit change in the independent variable.

*Coefficient of variation*
- The **coefficient of variation** is a measure of **relative variability** or dispersion, expressing the standard deviation as a percentage of the mean.
- It describes the extent of variation in relation to the mean but does not provide a basis for predicting one variable from another.

*Coefficient of correlation*
- The **coefficient of correlation** measures the **strength and direction of a linear relationship** between two variables.
- While it indicates how well two variables move together, it does not directly enable the prediction of one variable's value from another; that is the role of regression.

*Coefficient of determination*
- The **coefficient of determination (R²)** represents the **proportion of the variance** in the dependent variable that can be explained by the independent variable(s) in a regression model.
- It quantifies how well the regression model fits the observed data; it is not used directly for prediction but for assessing the predictive power of the model.

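
To make the role of the regression coefficient concrete, the sketch below fits a simple least-squares line y = a + b·x and uses it to predict a new value. The data points are hypothetical and deliberately lie on a straight line so the result is easy to follow; `fit_line` is an illustrative helper, not a library function.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx              # regression coefficient b
    intercept = my - slope * mx    # constant term a
    return slope, intercept


# Hypothetical paired observations (x, y).
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)

# Predict y for a new x using the fitted coefficients.
new_x = 6
predicted_y = intercept + slope * new_x
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted y at x={new_x}: {predicted_y:.2f}")
```

The slope is the quantity the explanation calls the regression coefficient: the expected change in y for a one-unit change in x.
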
Explanation: ***Chi-square***
- The **chi-square test** is used to compare proportions or frequencies between two or more categorical groups. Here, we are comparing the proportion of malnourished children (a categorical outcome) between two different living areas (rural vs. urban, also categorical).
- This test determines if there is a statistically significant association between the two categorical variables.

*Paired t-test*
- A **paired t-test** is used to compare the means of two related groups or samples, such as measurements taken before and after an intervention on the same individuals.
- This scenario involves comparing independent groups (rural vs. urban children) and proportions, not means from paired samples.

*The standard error of mean*
- The **standard error of the mean (SEM)** is a measure of the statistical accuracy of an estimate; specifically, it is the standard deviation of the sample mean's distribution.
- It is used to quantify the variability of sample means, not to perform a comparative hypothesis test between two groups.

*ANOVA*
- **ANOVA (Analysis of Variance)** is used to compare the means of **three or more independent groups**. Because it compares means, it is not appropriate for comparing proportions between two groups.
- If we were comparing the mean weight of children across three or more living areas, ANOVA would be suitable.

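
The sketch below applies a Pearson chi-square test to the 2×2 table from the question (rural: 30 malnourished of 50; urban: 20 malnourished of 50). It assumes no continuity (Yates) correction and uses only the standard library; for one degree of freedom the p-value can be written with the complementary error function. Variable names are illustrative.

```python
import math

# Rows: rural, urban. Columns: malnourished, not malnourished.
observed = [
    [30, 20],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

# A 2x2 table has 1 degree of freedom, where P(X > chi2) = erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-square = {chi2:.2f}, df = 1, p = {p_value:.4f}")
```

With a continuity correction the statistic would be smaller, so the choice of correction should be stated when reporting the result.
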
Explanation: ***Coefficient of variation***
- The **coefficient of variation (CV)** is a standardized measure of dispersion of a probability distribution or frequency distribution.
- It expresses the **standard deviation** as a percentage of the **mean**, making it useful for comparing the variability of two independent data sets with different units or widely different means.

*Standard Error of Mean*
- The **Standard Error of the Mean (SEM)** is used to estimate the variability between sample means if multiple samples were taken from the same population.
- It primarily quantifies how accurately a sample mean represents a population mean; it is not used for comparing variation between different data sets.

*Standard Deviation*
- **Standard deviation (SD)** measures the amount of variation or dispersion of a set of values *within a single data set*.
- While it quantifies variability, it is not ideal for directly comparing the variability of two data sets with different units or means because it is not normalized.

*Variance*
- **Variance** measures how far each number in the set is from the mean; it is the **average of the squared differences** from the mean.
- Like standard deviation, variance describes the spread within a single data set and is not normalized for direct comparison between data sets with different scales.

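
The following sketch shows why the CV allows comparison across data sets measured in different units. The height and weight values are hypothetical, and the sample standard deviation (`statistics.stdev`) is assumed.

```python
from statistics import mean, stdev


def coefficient_of_variation(values) -> float:
    """Return the sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100


# Hypothetical measurements in different units.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [50, 60, 70, 80, 90]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print(f"CV of heights: {cv_heights:.1f}%")
print(f"CV of weights: {cv_weights:.1f}%")
```

Because the CV is unit-free, the two percentages can be compared directly, which is not possible with the raw standard deviations in centimetres and kilograms.
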
Explanation: ***Test used to assess quantitative observations before and after an intervention.***
- The **Paired T test** is specifically designed to compare **means** of two related groups or measurements from the same subjects under two different conditions, for example, before and after an intervention.
- This test is appropriate when the data are **quantitative** and the observations are dependent, allowing for the analysis of individual changes.

*Test used for categorical data.*
- Tests for **categorical data** typically include **Chi-square tests** or **Fisher's exact tests**, which analyze frequencies and associations between categories, not means of quantitative data.
- The Paired T test requires **numerical, quantitative data** that can be averaged.

*Test applied to compare means of two independent groups.*
- Comparing means of **two independent groups** is typically done using an **Independent Samples T test** (also known as a Two-Sample T test), not a Paired T test.
- An **Independent Samples T test** assumes that the observations in each group are unrelated to each other.

*None of the options.*
- The correct description for the Paired T test is provided in one of the other options, making this statement incorrect.

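
A paired t statistic is computed from the within-subject differences. The sketch below uses hypothetical before/after readings for four subjects and computes only the t statistic and degrees of freedom; obtaining a p-value would need a t-distribution table or a statistics library, which is outside this example.

```python
import math
from statistics import mean, stdev

# Hypothetical readings for the same four subjects before and after an intervention.
before = [140, 150, 160, 155]
after = [135, 145, 150, 150]

# Work on the per-subject differences, since the observations are paired.
differences = [b - a for b, a in zip(before, after)]

n = len(differences)
mean_diff = mean(differences)
se_diff = stdev(differences) / math.sqrt(n)  # standard error of the mean difference

t_statistic = mean_diff / se_diff
degrees_of_freedom = n - 1

print(f"mean difference = {mean_diff}, t = {t_statistic:.2f}, df = {degrees_of_freedom}")
```

The test reduces two related samples to one sample of differences, which is why it cannot be used for independent groups.
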
Explanation: ***Mortality rate***
- The **mortality rate** directly reflects the health status and overall well-being of a population by indicating the number of deaths per unit population.
- A high mortality rate signals underlying public health issues, inadequate healthcare, or poor living conditions, making it the **most critical vital statistic** for assessing population health and guiding interventions.
- It serves as a **key indicator** for comparing health status across populations and time periods.

*Fertility rate*
- The **fertility rate** measures the average number of children born to women of reproductive age, influencing future population size and age structure.
- While important for demographic planning and population projections, it does not directly provide insights into the immediate health challenges or mortality burden of a population.

*Morbidity rate*
- The **morbidity rate** quantifies the incidence or prevalence of disease in a population, reflecting the disease burden.
- Although crucial for understanding health problems and planning healthcare services, it is considered secondary to mortality as a vital statistic since mortality represents the ultimate health outcome.

*Birth rate*
- The **birth rate** quantifies the number of live births per 1,000 people in a year, contributing to population growth and demographic trends.
- Like the fertility rate, it is essential for understanding natality patterns but offers less insight into the overall health status and survival of a population compared to the mortality rate.

Explanation: ***Stratified random***
- In **stratified random sampling**, the population is first divided into homogeneous subgroups (strata), and then a simple random sample is drawn from each stratum.
- This method ensures representation from all subgroups and matches the description in the question: the population is divided into homogeneous subgroups, and individuals are randomly selected from each subgroup.

*Simple random*
- **Simple random sampling** involves selecting individuals from an entire population purely by chance, where each individual has an equal probability of being chosen.
- This method does not involve an initial division of the population into distinct groups before selection.

*Systematic random*
- **Systematic random sampling** involves selecting every nth individual from a list after a random starting point.
- This method does not involve dividing the population into groups and then sampling from each group.

*Cluster*
- **Cluster sampling** involves dividing the population into clusters (usually naturally occurring groups), randomly selecting a few clusters, and then sampling *all* individuals within the selected clusters.
- In cluster sampling, individuals are not randomly selected *from each* group; instead, entire groups are selected.

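
The two-step procedure (divide into strata, then draw a simple random sample from each) can be sketched as follows. The strata, member IDs, sample size, and the `stratified_sample` helper are all hypothetical; equal allocation per stratum is assumed for simplicity.

```python
import random


def stratified_sample(strata: dict, per_stratum: int, seed=None) -> dict:
    """Draw a simple random sample of size per_stratum from every stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, per_stratum)  # sampling without replacement
        for name, members in strata.items()
    }


# Hypothetical population already divided into two homogeneous subgroups.
strata = {
    "rural": [f"R{i}" for i in range(1, 201)],
    "urban": [f"U{i}" for i in range(1, 301)],
}

sample = stratified_sample(strata, per_stratum=50, seed=42)

for name, chosen in sample.items():
    print(f"{name}: {len(chosen)} selected, first three: {chosen[:3]}")
```

Every stratum contributes to the sample, in contrast to cluster sampling, where only the selected clusters are observed.
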
Collection and Presentation of Data
Measures of Central Tendency
Measures of Dispersion
Normal Distribution
Sampling Methods
Sample Size Calculation
Hypothesis Testing
Tests of Significance
Correlation and Regression
Survival Analysis
Multivariate Analysis
Statistical Software in Research