12.2 The Chi-Square Test for Independence
The chi-square test for independence uses the observed data to assess whether there is enough evidence to conclude that two categorical variables are related in the population. As in the previous part on the ANOVA F-test, we first introduce the hypotheses (step 1) and then discuss the idea behind the test, which leads naturally to the test statistic (step 2). Let’s start.
Step 1: Stating the hypotheses
Unlike in all the previous tests we presented, the null and alternative hypotheses in the chi-square test are stated in words rather than in terms of population parameters. They are:
\(H_0:\) There is no relationship between the two categorical variables. (They are independent.)
\(H_a:\) There is a relationship between the two categorical variables. (They are not independent.)
EXAMPLE
In our example, the null and alternative hypotheses would then state:
\(H_0:\) There is no relationship between gender and drunk driving.
\(H_a:\) There is a relationship between gender and drunk driving.
Or equivalently,
\(H_0:\) Drunk driving and gender are independent.
\(H_a:\) Drunk driving and gender are not independent.
Hence the name “chi-square test for independence.”
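To make the word "independent" concrete, the minimal sketch below takes a small two-way table and computes the proportion of drunk drivers within each gender. The counts are hypothetical, invented purely for illustration; they are not the data from our example. The function name `conditional_rates` and the row and column labels are likewise assumptions of this sketch. If \(H_0\) were true, these two proportions would be equal in the population, so the test asks whether the sample proportions differ by more than chance alone could explain.

```python
# Hypothetical counts, invented for illustration only (NOT the study's data).
counts = {
    "male":   {"drove_drunk": 30, "did_not": 170},
    "female": {"drove_drunk": 15, "did_not": 185},
}


def conditional_rates(table, outcome):
    """Return, for each group (row), the proportion that falls in `outcome`."""
    return {
        group: row[outcome] / sum(row.values())
        for group, row in table.items()
    }


# Proportion of each gender that reported drunk driving.
rates = conditional_rates(counts, "drove_drunk")
for group, rate in rates.items():
    print(f"{group}: {rate:.1%} drove drunk")
```

With these made-up counts the sample proportions are 15.0% for males and 7.5% for females. A gap like this in a sample does not settle the question by itself; deciding whether it is large enough to reject \(H_0\) is the job of the test statistic.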