Kruskal-Wallis Test

Definition

Core Statement

The Kruskal-Wallis H Test is the non-parametric alternative to One-Way ANOVA. It compares the rank distributions of three or more independent groups to determine if at least one group differs.

Purpose

Test for differences among 3+ groups when data is ordinal or non-normal.
Extend the Mann-Whitney U test to more than two groups.

When to Use

Use Kruskal-Wallis When...

Outcome is ordinal or non-normal continuous.
There are three or more independent groups.
ANOVA assumptions (normality, equal variance) are violated.

Limitations

Like ANOVA, a significant result only tells you a difference exists, not which groups differ.
Requires post-hoc tests (Dunn's Test) for pairwise comparisons.

Theoretical Background

The H Statistic

$$H = \frac{12}{N (N + 1)} \sum_{j = 1}^{k} \frac{R_{j}^{2}}{n_{j}} - 3 (N + 1)$$

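where $R_{j}$ is the sum of ranks in group $j$, $n_{j}$ is the sample size of group $j$, $k$ is the number of groups, and $N$ is the total sample size. Ranks are assigned across the pooled data, with tied values receiving the average of the ranks they span.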
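
Two standard identities are useful alongside this definition. They are not part of the original text and are given here only as a supplementary sketch. Writing $\bar{R}_{j} = R_{j} / n_{j}$ for the mean rank of group $j$ (a symbol introduced here), $H$ is a weighted sum of squared deviations of the group mean ranks from the overall mean rank $(N + 1) / 2$. When the data contain ties, the usual correction divides $H$ by a factor built from the tie-group sizes $t_{i}$ (also introduced here: the number of observations sharing the $i$-th tied value).

```latex
H = \frac{12}{N (N + 1)} \sum_{j = 1}^{k} n_{j} \left( \bar{R}_{j} - \frac{N + 1}{2} \right)^{2},
\qquad \bar{R}_{j} = \frac{R_{j}}{n_{j}}

H_{\text{corrected}} = \frac{H}{1 - \dfrac{\sum_{i} \left( t_{i}^{3} - t_{i} \right)}{N^{3} - N}}
```

Expanding the square in the first line and using $\sum_{j} R_{j} = N (N + 1) / 2$ recovers the rank-sum form of $H$ given above.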

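Under $H_{0}$, $H$ approximately follows a chi-squared distribution with $k - 1$ degrees of freedom. This is a large-sample approximation and can be inaccurate when group sizes are very small.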
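
For very small groups, the exact null distribution can be enumerated directly and compared with the chi-squared approximation. The following is a minimal sketch, assuming three groups of three observations with no ties. The helper names `h_from_rank_groups` and `exact_null_h` are illustrative and do not come from any library.

```python
from itertools import combinations
from math import exp


def h_from_rank_groups(rank_groups):
    """H statistic from groups of ranks (no ties), using the rank-sum formula."""
    n_total = sum(len(g) for g in rank_groups)
    weighted = sum(sum(g) ** 2 / len(g) for g in rank_groups)
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


def exact_null_h(n_per_group=3):
    """H for every way of splitting the ranks into three equal-sized groups.

    Under H0 each split is equally likely, so this list is the exact
    null distribution of H.
    """
    ranks = set(range(1, 3 * n_per_group + 1))
    values = []
    for a in combinations(sorted(ranks), n_per_group):
        rest = ranks - set(a)
        for b in combinations(sorted(rest), n_per_group):
            c = tuple(rest - set(b))
            values.append(h_from_rank_groups([a, b, c]))
    return values


h_values = exact_null_h(3)
critical = 5.99  # chi-squared critical value for alpha = 0.05, df = 2
exact_tail = sum(h >= critical for h in h_values) / len(h_values)
approx_tail = exp(-critical / 2)  # chi-squared survival function for df = 2

print(f"arrangements: {len(h_values)}, largest possible H: {max(h_values):.2f}")
print(f"exact P(H >= {critical}): {exact_tail:.4f}, chi-squared: {approx_tail:.4f}")
```

With groups this small, only a handful of near-perfectly separated arrangements exceed the chi-squared critical value, which is why exact tables or permutation p-values are often preferred for tiny samples.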

Worked Example: Pain Relief Study

Problem

Comparing 3 drugs for pain relief (Scale 1-10, Ordinal).

Drug A: [2, 3, 3, 4] (Low pain)
Drug B: [5, 6, 5, 7] (Medium pain)
Drug C: [8, 9, 8, 10] (High pain)

Question: Is there a difference in effectiveness?

Solution:

Rank all data (N=12), giving tied values the average of their ranks:
- A: [1, 2.5, 2.5, 4] $\to R_{A} = 10$.
- B: [5.5, 7, 5.5, 8] $\to R_{B} = 26$.
- C: [9.5, 11, 9.5, 12] $\to R_{C} = 42$.
Calculate H Statistic:
$H = \frac{12}{12 (13)} \left( \frac{10^{2}}{4} + \frac{26^{2}}{4} + \frac{42^{2}}{4} \right) - 3 (13)$
$H = \frac{1}{13} (25 + 169 + 441) - 39$
$H = \frac{635}{13} - 39 \approx 48.85 - 39 = 9.85$
Result:
- $df = k - 1 = 2$. Critical $\chi^{2}$ (0.05, 2) = 5.99.
- $9.85 > 5.99$. Reject $H_{0}$.
- Conclusion: Pain levels differ significantly among the drugs. The rank sums suggest A gives the lowest pain scores and C the highest, but the test alone does not establish which pairs differ.
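
The hand calculation can be checked with a short script. This is a sketch using only the standard library. The `midranks` helper is a name introduced here, and the tie correction shown is the standard one, which the hand calculation above does not apply.

```python
from math import exp

samples = {
    "A": [2, 3, 3, 4],
    "B": [5, 6, 5, 7],
    "C": [8, 9, 8, 10],
}


def midranks(values):
    """Map each distinct value to the average of the ranks it occupies."""
    positions = {}
    for idx, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(idx)
    return {v: sum(p) / len(p) for v, p in positions.items()}


pooled = [v for group in samples.values() for v in group]
rank_of = midranks(pooled)
n_total = len(pooled)

# Sum of ranks per group (R_j in the formula).
rank_sums = {name: sum(rank_of[v] for v in group) for name, group in samples.items()}

h = 12 / (n_total * (n_total + 1)) * sum(
    rank_sums[name] ** 2 / len(group) for name, group in samples.items()
) - 3 * (n_total + 1)

# Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N), t = size of each tie group.
tie_counts = [pooled.count(v) for v in set(pooled)]
correction = 1 - sum(t**3 - t for t in tie_counts) / (n_total**3 - n_total)
h_corrected = h / correction

# For df = 2 the chi-squared survival function is exactly exp(-x / 2).
p_value = exp(-h / 2)
p_value_corrected = exp(-h_corrected / 2)

print(rank_sums)
print(f"H = {h:.3f}, p = {p_value:.4f}")
print(f"tie-corrected H = {h_corrected:.3f}, p = {p_value_corrected:.4f}")
```

SciPy's `stats.kruskal` applies the tie correction, so for these data its statistic should match the corrected value rather than the hand-computed one. The conclusion is the same either way.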

Assumptions

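Independence: observations are independent both within and between groups.
Ordinal or continuous data: the outcome must be rankable.
Similar distribution shapes: needed if the result is to be read as a shift in location (for example, a difference in medians). Without it, the test only indicates that some group tends to yield larger values than another.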

Limitations

Pitfalls

"One-Shot" Fallacy: Reporting a significant Kruskal-Wallis test isn't enough. A post-hoc procedure such as Dunn's Test is needed to identify which pairs (e.g., A vs B) differ.
Weak for small samples: With $n = 3$ per group, power is low and it is very hard to reach significance.
Shape assumption: If shapes vary widely (one bimodal, one normal), a significant result is harder to interpret as a meaningful difference in location.
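
To make the "One-Shot" point concrete, the sketch below hand-rolls Dunn's test with a Bonferroni adjustment for the pain-relief data from the worked example. It is a simplified illustration that uses only the standard library and a normal approximation with the usual tie adjustment. For real analyses, a maintained implementation such as `scikit_posthocs.posthoc_dunn` is the safer choice.

```python
from itertools import combinations
from math import erfc, sqrt

samples = {"A": [2, 3, 3, 4], "B": [5, 6, 5, 7], "C": [8, 9, 8, 10]}

pooled = sorted(v for group in samples.values() for v in group)
n_total = len(pooled)

# Midranks: tied values share the average of the positions they occupy.
rank_of = {}
for v in set(pooled):
    first = pooled.index(v) + 1
    count = pooled.count(v)
    rank_of[v] = first + (count - 1) / 2

mean_rank = {g: sum(rank_of[v] for v in vals) / len(vals) for g, vals in samples.items()}

# Variance of a mean-rank difference, with the tie adjustment.
tie_term = sum(pooled.count(v) ** 3 - pooled.count(v) for v in set(pooled)) / (12 * (n_total - 1))
base_var = n_total * (n_total + 1) / 12 - tie_term

pairs = list(combinations(samples, 2))
adjusted = {}
for g1, g2 in pairs:
    se = sqrt(base_var * (1 / len(samples[g1]) + 1 / len(samples[g2])))
    z = (mean_rank[g1] - mean_rank[g2]) / se
    p_raw = erfc(abs(z) / sqrt(2))  # two-sided p-value from the standard normal
    adjusted[(g1, g2)] = min(1.0, p_raw * len(pairs))  # Bonferroni adjustment

for pair, p in adjusted.items():
    print(pair, f"adjusted p = {p:.4f}")
```

For these data only the A vs C comparison falls below 0.05 after adjustment, even though the omnibus test is clearly significant. That is exactly the information the Kruskal-Wallis result alone cannot provide.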

Python Implementation

from scipy import stats
import scikit_posthocs as sp  # third-party package: scikit-posthocs

group1 = [5, 6, 7, 8]
group2 = [10, 12, 14, 16]
group3 = [20, 22, 24, 26]

# Kruskal-Wallis Test
h_stat, p_val = stats.kruskal(group1, group2, group3)
print(f"H-statistic: {h_stat:.2f}, p-value: {p_val:.4f}")

# Post-Hoc: Dunn's Test with Bonferroni adjustment.
# posthoc_dunn accepts a list of samples and returns a table of
# adjusted pairwise p-values, one row and column per group.
dunn = sp.posthoc_dunn([group1, group2, group3], p_adjust='bonferroni')
print(dunn)

R Implementation

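# Assumes df is a data frame with a numeric column Value and a factor column Group
# Kruskal-Wallis Test
kruskal.test(Value ~ Group, data = df)

# Post-Hoc: Dunn's Test (requires the FSA package)
library(FSA)
dunnTest(Value ~ Group, data = df, method = "bonferroni")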

Interpretation Guide

Output	Interpretation
--------	----------------
H = 9.85, p = 0.007	Reject $H_{0}$. At least one group tends to have higher or lower ranks than the others.
High H Value	Large separation between the groups' mean ranks (e.g., Mean Rank A $\neq$ Mean Rank B).
Dunn p-adj < 0.05	Specific pair (e.g., A vs C) is significantly different.
Effect Size ($\eta_{H}^{2}$)	Proportion of variance in the ranks explained by group membership.
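
The effect-size row can be made concrete with the commonly used definition $\eta_{H}^{2} = (H - k + 1) / (N - k)$. This formula is not stated in the original text, so treat the snippet as a sketch under that assumption. The function name `eta_squared_h` is introduced here for illustration.

```python
def eta_squared_h(h, k, n_total):
    """Eta-squared based on H: (H - k + 1) / (N - k)."""
    return (h - k + 1) / (n_total - k)


# Worked example: H = 635/13 - 39, k = 3 groups, N = 12 observations.
h = 635 / 13 - 39
effect = eta_squared_h(h, k=3, n_total=12)
print(f"eta^2_H = {effect:.2f}")
```

A value this close to 1 reflects the complete separation of the three groups in the worked example. Real data rarely produce effects this large.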

Related Concepts

One-Way ANOVA - Parametric alternative.
Mann-Whitney U Test - For 2 groups.
Dunn's Test - Post-hoc pairwise comparison.
