Spearman's Rank Correlation

Definition

Core Statement

Spearman's rank correlation ($ρ$ or $r_{s}$) measures the strength and direction of the monotonic relationship between two variables. Unlike Pearson's correlation, it operates on ranks rather than raw values, which makes it robust to outliers and applicable to ordinal data.

Purpose

Measure association when the relationship is monotonic but not necessarily linear.
Analyze ordinal data (e.g., rankings, Likert scales).
Provide a robust alternative to Pearson when outliers are present.

When to Use

Use Spearman When...

The data are ordinal.
The relationship is monotonic (consistently increasing or consistently decreasing), but not necessarily linear.
Outliers are present.
The normality assumption behind Pearson's significance test is not met.

Monotonic vs Linear

Linear: $Y = a X + b$ (straight line).
Monotonic: $Y$ never decreases as $X$ increases, or never increases (a curve is fine). E.g., $Y = X^{2}$ for $X > 0$.
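To make the distinction concrete, the following sketch compares Pearson's correlation on raw values with the same correlation computed on ranks, using $Y = X^{2}$ for $X > 0$. The helper names `pearson` and `simple_ranks` are illustrative (not from any library), and the ranking helper assumes there are no tied values.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def simple_ranks(values):
    """Rank from 1 (smallest). Assumption: all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

x = [1, 2, 3, 4, 5, 6]
y = [v ** 2 for v in x]  # monotonic but curved

r_linear = pearson(x, y)                               # raw values
rho_ranks = pearson(simple_ranks(x), simple_ranks(y))  # ranks

print(f"Pearson on raw values: {r_linear:.3f}")
print(f"Pearson on ranks (Spearman): {rho_ranks:.3f}")
```

Because squaring preserves the order of positive values, both rank vectors are identical and the rank-based coefficient is 1, while the raw-value coefficient falls slightly below 1.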

Theoretical Background

Calculation

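Rank all $X$ values (1 = smallest) and, separately, all $Y$ values. Tied values share the average of the ranks they would otherwise occupy.
Calculate the Pearson correlation on the ranks.

When there are no ties, this is equivalent to the simplified formula:

$$ρ = 1 - \frac{6 \sum d_{i}^{2}}{n (n^{2} - 1)}$$

where $d_{i}$ is the difference between the ranks of $X_{i}$ and $Y_{i}$, and $n$ is the number of pairs.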
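The two steps and the $d_{i}^{2}$ formula can be sketched in plain Python as follows. The data values and the helper names (`simple_ranks`, `pearson`) are made up for illustration, and the sketch assumes no ties, which is the case in which both routes should agree.

```python
import math

def simple_ranks(values):
    """Rank from 1 (smallest). Assumption: all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical paired measurements with no tied values.
x = [3.1, 1.4, 5.9, 2.6, 4.7, 0.8]
y = [20, 11, 35, 25, 28, 15]

rank_x = simple_ranks(x)  # step 1: rank X
rank_y = simple_ranks(y)  # step 1: rank Y

# Step 2: Pearson correlation on the ranks.
rho_from_pearson = pearson(rank_x, rank_y)

# Same quantity via the d^2 formula (valid because there are no ties).
n = len(x)
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
rho_from_formula = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(f"Pearson on ranks: {rho_from_pearson:.4f}")
print(f"d^2 formula:      {rho_from_formula:.4f}")
```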

Interpretation

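Same range as Pearson: from -1 to +1. $ρ = +1$ indicates a perfectly increasing monotonic relationship, $ρ = -1$ a perfectly decreasing one, and $ρ = 0$ no monotonic association.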

Assumptions

Ordinal or continuous data.
Monotonic relationship: as $X$ increases, $Y$ consistently increases (or consistently decreases).
Independence: each $(X, Y)$ pair is observed independently of the others.

Limitations

Pitfalls

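Ties complicate the calculation: with many tied values, the simplified $d^{2}$ formula no longer gives the correct $ρ$.
Does not capture non-monotonic relationships: if the relationship changes direction (e.g., U-shaped), $ρ$ can be close to 0 even though the variables are strongly related.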
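The non-monotonic pitfall can be illustrated with a small sketch in which $Y$ is fully determined by $X$ yet the rank correlation is zero. The helpers `average_ranks` and `pearson` are illustrative names, and the data (a symmetric parabola) are chosen purely for demonstration.

```python
import math

def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]  # U-shaped: falls, then rises

rho = pearson(average_ranks(x), average_ranks(y))
print(f"Spearman rho for a U-shape: {rho:.3f}")
```

The falling and rising halves cancel out in the rank covariance, so a scatter plot remains the safer first check before trusting a near-zero coefficient.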

Python Implementation

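from scipy import stats

# Paired observations (here, the judge rankings from the worked example below)
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]

# spearmanr ranks both inputs and returns the coefficient and a p-value
rho, p_val = stats.spearmanr(x, y)

print(f"Spearman rho: {rho:.3f}")
print(f"p-value: {p_val:.4f}")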

R Implementation

# x and y are numeric vectors of equal length
cor.test(x, y, method = "spearman")

Worked Numerical Example

Contest Rankings

Data: 5 Participants.

Judge A Ranks: [1, 2, 3, 4, 5]
Judge B Ranks: [1, 3, 2, 5, 4]

Differences ( $d$ ):

$1 - 1 = 0, 2 - 3 = - 1, 3 - 2 = 1, 4 - 5 = - 1, 5 - 4 = 1$
Squared diffs ( $d^{2}$ ): $0, 1, 1, 1, 1$ . Sum = 4.

Calculation:

$ρ = 1 - \frac{6 \times 4}{5 (25 - 1)} = 1 - \frac{24}{120} = 1 - 0.2 = 0.8$ .

Interpretation: Strong positive agreement ( $ρ = 0.8$ ) between the two judges.
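The arithmetic above can be checked with a few lines of Python. This is only a sketch of the hand calculation using the two judges' ranks as given; the variable names are illustrative.

```python
judge_a = [1, 2, 3, 4, 5]
judge_b = [1, 3, 2, 5, 4]

# Rank differences and their squares, pair by pair.
d = [a - b for a, b in zip(judge_a, judge_b)]
d_squared = [di ** 2 for di in d]
sum_d2 = sum(d_squared)

# Simplified formula; valid here because neither judge assigned tied ranks.
n = len(judge_a)
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print("d:", d)
print("d^2:", d_squared, "sum =", sum_d2)
print(f"rho = {rho:.1f}")
```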

Interpretation Guide

Scenario	Interpretation	Edge Case Notes
$ρ = 0.9$	Strong positive monotonic relationship.	X increases $\to$ Y increases.
$ρ = - 0.6$	Moderate negative monotonic relationship.	X increases $\to$ Y decreases.
Pearson $r = 0.4$ , Spearman $ρ = 0.75$	Suggests a monotonic but non-linear relationship (e.g., exponential).	Spearman is the better summary here.
$ρ = 0$	No monotonic relationship.	A non-monotonic relationship (e.g., U-shaped) may still exist.

Common Pitfall Example

The "Ties" Trap

Scenario: Analyzing Customer Satisfaction (1-5 scale).
Data: Thousands of customers, only 5 possible values (many ties).

Problem:

The simplified formula $ρ = 1 - \frac{6 \sum d^{2}}{n (n^{2} - 1)}$ assumes no ties.
With heavy ties, it gives an inaccurate value.

Solution:

Use software (Python/R), which assigns tied values their average rank and computes the correlation on those ranks, so ties are handled automatically.
Do not calculate manually with the simplified formula for Likert-scale data.
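A small sketch shows how far the simplified formula can drift once ties appear. It compares the simplified formula with a tie-aware calculation, taken here to be the Pearson correlation on average ranks (an assumption about how ties are typically handled). The six Likert-style responses are invented for illustration, and the helper names are not from any library.

```python
import math

def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical responses on a 3-point scale, so ties are unavoidable.
x = [1, 1, 2, 2, 3, 3]
y = [1, 2, 1, 3, 2, 3]

rank_x, rank_y = average_ranks(x), average_ranks(y)

# Tie-aware: Pearson correlation on the average ranks.
rho_tie_aware = pearson(rank_x, rank_y)

# Simplified formula, applied even though its no-ties assumption is violated.
n = len(x)
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
rho_simplified = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(f"Tie-aware rho:  {rho_tie_aware:.3f}")
print(f"Simplified rho: {rho_simplified:.3f}")
```

With these six points the simplified formula overstates the association. The size and direction of the gap depend on the data, so it is not a fixed correction factor.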

Related Concepts

Pearson Correlation - Parametric; measures linear association.
Kendall's Tau - Rank-based alternative, often preferred for small samples.
