Chi-Squared Test for Independence
Overview:
The chi-squared test of independence tests whether two categorical variables measured on the same population are independent of each other. Specifically, we compare the distribution of one variable within each category of the other and assess how likely it is that those distributions are the same.
Hypothesis
The null hypothesis states that the two variables, measured on a single population, are independent of each other, while the alternative hypothesis states that they are not independent. The hypotheses are written in words, not symbols.
Assumptions and Conditions
- Randomization Condition: The data must be chosen randomly.
- 10% Condition: The sample size, n, must be no larger than 10% of the population.
- Expected Cell Frequency Condition: The expected count in each cell must be at least 5.
Mechanics
χ²: The chi-squared test statistic is the sum, over all cells, of the squared deviations between the observed and expected counts, each divided by the expected count: χ² = Σ (Obs - Exp)² / Exp.
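As a minimal sketch of the calculation described above, the following Python computes the expected counts and the χ² statistic for a two-way table. It assumes the usual expected count for a cell, (row total × column total) / grand total. The function names and the 2x2 table of counts are hypothetical and are not data from this guide.

```python
def expected_counts(observed):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared_statistic(observed):
    """Sum of (observed - expected)^2 / expected over every cell."""
    expected = expected_counts(observed)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(observed, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


# Hypothetical observed counts (rows = one variable, columns = the other).
table = [[30, 20], [20, 30]]
print(expected_counts(table))        # [[25.0, 25.0], [25.0, 25.0]]
print(chi_squared_statistic(table))  # 4.0
```

Each of the four cells is 5 away from its expected count of 25, so each contributes 25 / 25 = 1 to the total.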
Degrees of Freedom: (R-1)(C-1), where R is the number of rows and C is the number of columns in the table.
*The degrees of freedom are needed to find a P-value for the chi-squared statistic.
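To illustrate how the degrees of freedom turn a χ² statistic into a P-value, here is a rough sketch using only the Python standard library. The helper names are hypothetical, and the closed-form tail calculation is one possible approach chosen for this sketch; it is not a description of what the calculator does internally.

```python
import math


def degrees_of_freedom(rows, cols):
    """(R - 1)(C - 1) for a table with R rows and C columns."""
    return (rows - 1) * (cols - 1)


def chi_squared_p_value(statistic, df):
    """Upper-tail probability of a chi-squared variable with df degrees of freedom."""
    z = statistic / 2
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-z)              # exact tail for df = 2
    else:
        a, tail = 0.5, math.erfc(math.sqrt(z))   # exact tail for df = 1
    # Raise the shape parameter a in steps of 1 until it reaches df / 2,
    # adding the matching correction term at each step.
    while a < df / 2:
        tail += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1
    return tail


df = degrees_of_freedom(2, 2)        # a 2x2 table has 1 degree of freedom
print(df)
print(chi_squared_p_value(4.0, df))  # roughly 0.0455
```

With 1 degree of freedom, a statistic of 4.0 gives a P-value just under 0.05, which matches the familiar critical value of about 3.84 at that level.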
TI Tip
2nd › Matrix › Edit › Type the Dimensions and the observed counts into Matrix A.
Stat › Tests › C: χ²-Test
Standardized Residual: If you reject the null hypothesis, it is always good to check the standardized residuals, (Obs - Exp) / √Exp, to see which cells deviate most from their expected counts.
After running the χ²-Test on the calculator, view Matrix B (which holds the expected cell counts) to check the Expected Cell Frequency Condition.
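The same two checks can be sketched off the calculator. The Python below assumes the common definition of a standardized residual, (observed - expected) / sqrt(expected), and a minimum expected count of 5. The function names and the table are hypothetical, not data from this guide.

```python
import math


def expected_counts(observed):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def standardized_residuals(observed):
    """(observed - expected) / sqrt(expected) for every cell (assumed definition)."""
    expected = expected_counts(observed)
    return [
        [(obs - exp) / math.sqrt(exp) for obs, exp in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]


def meets_expected_count_condition(observed, minimum=5):
    """True when every expected cell count is at least `minimum`."""
    return all(exp >= minimum for row in expected_counts(observed) for exp in row)


# Hypothetical observed counts.
table = [[30, 20], [20, 30]]
print(meets_expected_count_condition(table))  # True
print(standardized_residuals(table))          # [[1.0, -1.0], [-1.0, 1.0]]
```

A positive residual marks a cell with more observations than independence would predict, and a negative residual marks a cell with fewer.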
Conclusion
In your conclusion you either reject or fail to reject the null hypothesis. If the P-value is higher than the alpha level (commonly 0.05), you fail to reject the null hypothesis: there is not enough evidence to conclude that the two variables are associated. If the P-value is lower than the alpha level, you reject the null hypothesis: there is enough evidence to conclude that the variables are not independent.
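The decision rule can be written as a small sketch. The function name is hypothetical, and treating a P-value exactly equal to alpha as "fail to reject" is an assumption of this sketch, since the text only covers the higher and lower cases.

```python
def conclusion(p_value, alpha=0.05):
    """Compare the P-value with the alpha level and state the decision."""
    if p_value < alpha:
        return "Reject the null hypothesis: evidence that the variables are not independent."
    # Assumption: a P-value equal to alpha is treated as "fail to reject".
    return "Fail to reject the null hypothesis: not enough evidence of an association."


print(conclusion(0.03))  # reject
print(conclusion(0.20))  # fail to reject
```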
Example:
a) To find the degrees of freedom, we use the equation shown above: (2-1)(2-1) = 1
b) Type these values into a 2x2 matrix and go to Stat > Tests > χ²-Test. You only need to fill in Matrix [A]; Matrix [B] will be filled in automatically with the expected counts. The expected count for the epidural/no breastfeeding cell is 159.34.
c) This sample contains fewer than 10% of all babies, and Matrix [B] shows that no expected counts are under 5.