Chi-Square Test of Independence

Definition

Core Statement

The Chi-Square Test of Independence (χ²) determines whether there is a statistically significant association between two categorical variables. It compares observed frequencies to the frequencies expected under the assumption of independence.


Purpose

  1. Test if two categorical variables are related (e.g., Gender and Product Preference).
  2. Analyze contingency tables (cross-tabulations), as sketched below.
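
A contingency table can be built directly from raw categorical data. The sketch below (hypothetical column names and records) uses pandas.crosstab to turn one-row-per-observation data into the frequency counts the test needs.

import pandas as pd

# Hypothetical raw data: one row per respondent
df = pd.DataFrame({
    "gender": ["Male", "Male", "Female", "Female", "Male", "Female"],
    "preference": ["Product A", "Product B", "Product B", "Product A", "Product A", "Product B"],
})

# Cross-tabulation: counts of each (gender, preference) combination
table = pd.crosstab(df["gender"], df["preference"])
print(table)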

When to Use

Use Chi-Square When...

  • Both variables are categorical (nominal or ordinal).
  • Data is in the form of frequency counts.
  • Expected counts are at least 5 in at least 80% of cells (and no cell has an expected count below 1).

Alternatives

  • Expected counts < 5: Use Fisher's Exact Test (see the sketch after this list).
  • Ordinal data with direction: Consider Cochran-Armitage trend test.
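
One way to apply this rule in practice, sketched here for a hypothetical small 2×2 table, is to inspect the expected counts returned by scipy.stats.chi2_contingency and fall back to scipy.stats.fisher_exact when any of them drop below 5:

from scipy.stats import chi2_contingency, fisher_exact
import numpy as np

# Hypothetical small 2x2 table with low expected counts
table = np.array([[8, 2],
                  [1, 9]])

chi2, p, dof, expected = chi2_contingency(table)

if (expected < 5).any():
    # Expected-count rule violated: use Fisher's Exact Test (2x2 tables only in SciPy)
    _, p = fisher_exact(table)
    print(f"Fisher's Exact Test: p = {p:.4f}")
else:
    print(f"Chi-Square Test: p = {p:.4f}")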


Theoretical Background

Hypotheses

  • H0 (null hypothesis): The two variables are independent (no association).
  • H1 (alternative hypothesis): The two variables are not independent (there is an association).

The Chi-Square Statistic

χ² = Σ (Oᵢ - Eᵢ)² / Eᵢ

where:

  • Oᵢ = observed frequency in cell i
  • Eᵢ = expected frequency in cell i under independence, computed as (row total × column total) / n

Degrees of Freedom: df = (r - 1)(c - 1) for an r × c table.

Logic

If the observed counts are far from the counts expected under independence, χ² is large and we reject H0.
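
To make this concrete, the following sketch (reusing the Gender × Product table from the Python section below) builds the expected counts from the row and column totals and sums the (O - E)² / E terms by hand:

import numpy as np
from scipy.stats import chi2

observed = np.array([[30, 10], [20, 40]])

# Expected counts under independence: E_ij = (row total x column total) / n
n = observed.sum()
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n

# Chi-square statistic, degrees of freedom, and p-value
chi2_stat = ((observed - expected) ** 2 / expected).sum()
df = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_value = chi2.sf(chi2_stat, df)

print(f"Chi-Square: {chi2_stat:.2f}, df = {df}, p = {p_value:.4f}")
# Note: scipy.stats.chi2_contingency applies Yates' continuity correction
# by default for 2x2 tables, so its statistic will be slightly smaller.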


Assumptions

  • Observations are independent of one another (each case contributes to exactly one cell).
  • The data are frequency counts, not percentages, proportions, or means.
  • Expected counts are sufficiently large (at least 5 in at least 80% of cells, none below 1).

Limitations

Pitfalls

  1. Sensitive to Sample Size: With very large n, even trivial associations become significant. Report Cramer's V.
  2. Does not measure strength. Chi-square tells you whether an association exists, not how strong it is. Calculate Cramer's V: V = √(χ² / (n × min(r - 1, c - 1))) (see the helper sketch after this list).
  3. Directionless: Does not indicate which categories drive the association.
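
A small helper for the formula in point 2 might look like the sketch below; the function name is illustrative, and it recomputes χ² without Yates' continuity correction so that the value matches the plain formula.

import numpy as np
from scipy.stats import chi2_contingency

def cramers_v(table):
    """Cramer's V for an r x c contingency table (uncorrected chi-square)."""
    table = np.asarray(table)
    chi2_stat, _, _, _ = chi2_contingency(table, correction=False)
    n = table.sum()
    min_dim = min(table.shape) - 1
    return np.sqrt(chi2_stat / (n * min_dim))

print(f"Cramer's V: {cramers_v([[30, 10], [20, 40]]):.3f}")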


Python Implementation

from scipy.stats import chi2_contingency
import numpy as np

# Contingency Table
#           Product A  Product B
# Male         30         10
# Female       20         40
table = np.array([[30, 10], [20, 40]])

# Note: for 2x2 tables, chi2_contingency applies Yates' continuity
# correction by default (pass correction=False to disable it).
chi2, p, dof, expected = chi2_contingency(table)

print(f"Chi-Square: {chi2:.2f}")
print(f"p-value: {p:.4f}")
print(f"Degrees of Freedom: {dof}")
print(f"Expected Counts:\n{expected}")

# Effect Size: Cramer's V
n = table.sum()
min_dim = min(table.shape) - 1
cramers_v = np.sqrt(chi2 / (n * min_dim))
print(f"Cramer's V: {cramers_v:.3f}")

R Implementation

# Create Table
tbl <- matrix(c(30, 10, 20, 40), nrow = 2, byrow = TRUE)
rownames(tbl) <- c("Male", "Female")
colnames(tbl) <- c("Product A", "Product B")

# Chi-Square Test
result <- chisq.test(tbl)
print(result)

# Check Expected Counts (Assumption)
print(result$expected)

# Effect Size: Cramer's V
library(vcd)
assocstats(tbl)

Worked Numerical Example

A/B Testing: Button Color vs Clicks

Data:

  • Red Button: 50 Clicks, 950 No Clicks (Total 1000) -> 5% CTR
  • Green Button: 80 Clicks, 920 No Clicks (Total 1000) -> 8% CTR

Contingency Table:

              Click   No Click
Red             50        950
Green           80        920

Results:

  • χ² ≈ 7.40, p ≈ 0.007 (without continuity correction; with Yates' correction, the default in R and SciPy for 2×2 tables, χ² ≈ 6.92, p ≈ 0.009).
  • Conclusion: the Green button significantly outperforms the Red button.
  • Cramer's V ≈ 0.06: the effect is statistically significant but weak (see the verification sketch below).
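
These numbers can be reproduced with a short sketch (continuity correction disabled so the statistic matches the plain χ² formula):

from scipy.stats import chi2_contingency
import numpy as np

#                 Click  No Click
clicks = np.array([[50,   950],    # Red
                   [80,   920]])   # Green

chi2, p, dof, expected = chi2_contingency(clicks, correction=False)
cramers_v = np.sqrt(chi2 / (clicks.sum() * (min(clicks.shape) - 1)))

print(f"Chi-Square: {chi2:.2f}")       # ~7.40
print(f"p-value: {p:.4f}")             # ~0.0065
print(f"Cramer's V: {cramers_v:.3f}")  # ~0.061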

Interpretation Guide

  • p < 0.05: An association exists. It does not say how the variables are associated; look at the counts.
  • p = 0.00001: Strong evidence of an association. Caution: with N = 1,000,000, even tiny differences become "significant"; check Cramer's V.
  • Cramer's V = 0.1: Weak association.
  • Cramer's V = 0.6: Strong association.
  • Warning "Chi-squared approximation may be incorrect" (R): some expected counts are < 5, so the χ² approximation is unreliable. Switch to Fisher's Exact Test.

Common Pitfall Example

Large Sample Size Trap

Scenario: Analyzing a huge dataset (N = 50,000).
Variables: Gender (M/F) vs Preferred Pet (Cat/Dog).

Result:

  • Males: 50.1% prefer dogs
  • Females: 49.9% prefer dogs
  • With a large enough sample, the χ² test returns p < 0.05 even for a gap this small.

Correction:

  • Yes, the test can flag a "statistically significant" difference.
  • But: the difference (0.2 percentage points) is practically meaningless.
  • Always interpret the EFFECT SIZE (Cramer's V) alongside the p-value for large samples (see the sketch below).
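
The following sketch makes the trap visible with hypothetical counts: the same 0.2-point gap is scaled to ever larger samples; the p-value eventually collapses below 0.05, while Cramer's V stays at a negligible 0.002.

import numpy as np
from scipy.stats import chi2_contingency

def test_at_scale(n_per_group):
    """Same 50.1% vs 49.9% dog preference, scaled to different (hypothetical) sample sizes."""
    dog_m = round(0.501 * n_per_group)
    dog_f = round(0.499 * n_per_group)
    table = np.array([[dog_m, n_per_group - dog_m],    # Male
                      [dog_f, n_per_group - dog_f]])   # Female
    chi2, p, _, _ = chi2_contingency(table, correction=False)
    v = np.sqrt(chi2 / (table.sum() * (min(table.shape) - 1)))
    return p, v

for n in [25_000, 250_000, 2_500_000]:
    p, v = test_at_scale(n)
    print(f"n per group = {n:>9,}: p = {p:.4f}, Cramer's V = {v:.4f}")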