Spearman's Rank Correlation

Definition

Core Statement

Spearman's rank correlation coefficient (ρ or r_s) measures the strength and direction of the monotonic relationship between two variables. Unlike Pearson's correlation, it operates on ranks rather than raw values, which makes it less sensitive to outliers and applicable to ordinal data.


Purpose

  1. Measure association when the relationship is monotonic but not necessarily linear.
  2. Analyze ordinal data (e.g., rankings, Likert scales).
  3. Provide a robust alternative to Pearson when outliers are present.

When to Use

Use Spearman When...

  • Data is ordinal.
  • The relationship is monotonic (always increasing or always decreasing), but not necessarily linear.
  • Outliers are present.
  • Normality is not met.

Monotonic vs Linear

  • Linear: Y=aX+b (straight line).
  • Monotonic: Y consistently increases (or consistently decreases) as X increases; a curve is fine. E.g., $Y = X^2$ for $X > 0$.
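
The distinction can be checked numerically. The following is a minimal sketch using only the standard library; `pearson` and `ranks` are illustrative helper names (not library functions), and `ranks` assumes there are no tied values. For $Y = X^2$ with $X > 0$, Pearson's r falls short of 1 while the correlation of the ranks is exactly 1.

```python
from statistics import mean

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

def ranks(values):
    """Rank values from 1 (smallest). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

x = list(range(1, 11))       # X > 0
y = [v ** 2 for v in x]      # monotonic but not linear

r = pearson(x, y)                    # linear association on raw values
rho = pearson(ranks(x), ranks(y))    # Spearman: Pearson on the ranks

print(f"Pearson r:    {r:.3f}")
print(f"Spearman rho: {rho:.3f}")
```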


Theoretical Background

Calculation

  1. Rank all X values (1 = smallest). Rank all Y values.
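  2. Calculate Pearson correlation on the ranks. When there are no ties, this simplifies to:

$$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

where $d_i$ is the difference between the ranks of $X_i$ and $Y_i$, and $n$ is the number of pairs.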
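
As a rough illustration of the two steps, the sketch below ranks a small made-up sample, then compares Pearson correlation on the ranks with the simplified formula. The data and the helper names `ranks` and `pearson` are assumptions for this example only, and the sample deliberately contains no ties.

```python
from statistics import mean

def ranks(values):
    """Step 1: rank values from 1 (smallest). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

# Hypothetical paired observations with no tied values
x = [3.1, 1.2, 4.8, 2.0, 5.5, 0.7]
y = [20, 8, 15, 9, 30, 5]

rx, ry = ranks(x), ranks(y)

# Step 2: Pearson correlation computed on the ranks
rho_pearson = pearson(rx, ry)

# Simplified formula: 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))
n = len(x)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
rho_shortcut = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(f"Pearson on ranks:   {rho_pearson:.4f}")
print(f"Simplified formula: {rho_shortcut:.4f}")
```

With no ties, the two printed values should agree up to floating-point error.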

Interpretation

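As with Pearson, ρ ranges from -1 to +1: +1 indicates a perfectly increasing monotonic relationship, -1 a perfectly decreasing one, and 0 no monotonic relationship.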


Assumptions


Limitations

Pitfalls

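  1. Ties reduce precision: With many tied values, the simplified formula is no longer exact and can distort ρ.
  2. Does not capture non-monotonic relationships: If the relationship changes direction (e.g., U-shaped), ρ can be close to 0 even though the variables are strongly related.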
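
The second pitfall is easy to reproduce. This sketch builds a perfectly U-shaped relationship ($Y = X^2$ over a range symmetric around zero) and computes Spearman's ρ as Pearson correlation on average ranks. The helper names `average_ranks` and `pearson` are illustrative, not library functions.

```python
from statistics import mean

def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

x = list(range(-5, 6))      # symmetric around zero
y = [v ** 2 for v in x]     # U-shaped: decreases, then increases

rho = pearson(average_ranks(x), average_ranks(y))
print(f"Spearman rho for a U-shape: {rho:.3f}")
```

Y is fully determined by X here, yet ρ comes out at essentially zero because the decreasing and increasing halves cancel.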


Python Implementation

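from scipy import stats

# Example data (illustrative): two paired samples of equal length
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]

# spearmanr returns the correlation coefficient and a p-value
rho, p_val = stats.spearmanr(x, y)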

print(f"Spearman rho: {rho:.3f}")
print(f"p-value: {p_val:.4f}")

R Implementation

# x and y are paired numeric vectors of equal length
cor.test(x, y, method = "spearman")

Worked Numerical Example

Contest Rankings

Data: 5 Participants.

  • Judge A Ranks: [1, 2, 3, 4, 5]
  • Judge B Ranks: [1, 3, 2, 5, 4]

Differences (d):

  • $1-1=0$, $2-3=-1$, $3-2=1$, $4-5=-1$, $5-4=1$
  • Squared diffs ($d^2$): 0, 1, 1, 1, 1. Sum = 4.

Calculation:

  • $\rho = 1 - \frac{6 \times 4}{5(25 - 1)} = 1 - \frac{24}{120} = 1 - 0.2 = 0.8$.

Interpretation: Strong positive agreement (ρ=0.8) between the two judges.
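
The arithmetic above can be reproduced in a few lines. This is a minimal sketch that applies the simplified formula directly to the two judges' ranks; the variable names are illustrative, and the formula applies here because neither judge assigned tied ranks.

```python
judge_a = [1, 2, 3, 4, 5]
judge_b = [1, 3, 2, 5, 4]

n = len(judge_a)

# Rank differences and their squares
d = [a - b for a, b in zip(judge_a, judge_b)]
sum_d2 = sum(v ** 2 for v in d)

# Simplified formula: 1 - 6 * sum(d^2) / (n * (n^2 - 1))
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(f"d = {d}")
print(f"sum of d^2 = {sum_d2}")
print(f"rho = {rho:.1f}")
```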


Interpretation Guide

| Scenario | Interpretation | Edge Case Notes |
|---|---|---|
| ρ = 0.9 | Strong positive monotonic relationship. | As X increases, Y increases. |
| ρ = -0.6 | Moderate negative monotonic relationship. | As X increases, Y decreases. |
| Pearson r = 0.4, Spearman ρ = 0.75 | Relationship is monotonic but non-linear (e.g., exponential). | Spearman is the better metric here. |
| ρ = 0 | No monotonic relationship. | Could still be non-monotonic (U-shape). |

Common Pitfall Example

The "Ties" Trap

Scenario: Analyzing Customer Satisfaction (1-5 scale).
Data: Thousands of customers, only 5 possible values (many ties).

Problem:

  • The standard formula $\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$ assumes no ties.
  • With heavy ties, this formula is inaccurate.

Solution:

  • Use software (Python/R), which handles ties automatically: tied values receive averaged ranks, and the "tie-corrected" coefficient is computed from those ranks.
  • Do not calculate manually using the simplified formula for Likert-scale data.
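
To make the trap concrete, the sketch below compares the simplified formula with a tie-aware calculation (Pearson correlation on average ranks) for a small made-up set of 1-5 ratings. The data and the helper names `average_ranks` and `pearson` are assumptions for illustration; a real survey would have far more rows and heavier ties.

```python
from statistics import mean

def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

# Hypothetical 1-5 ratings from ten customers on two questions
satisfaction = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
recommend = [2, 1, 2, 3, 4, 3, 4, 5, 4, 5]

ra, rb = average_ranks(satisfaction), average_ranks(recommend)
n = len(ra)

# Simplified formula, which assumes no ties
sum_d2 = sum((a - b) ** 2 for a, b in zip(ra, rb))
rho_shortcut = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Tie-aware value: Pearson correlation on the average ranks
rho_tie_aware = pearson(ra, rb)

print(f"Simplified formula: {rho_shortcut:.4f}")
print(f"Tie-aware value:    {rho_tie_aware:.4f}")
```

The two results differ even on ten rows, so the simplified formula should not be relied on for data like this.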