Propensity Score Matching (PSM)

Definition

Core Statement

Propensity Score Matching (PSM) is a quasi-experimental method that creates comparable treatment and control groups from observational data by matching units with similar propensity scores (the probability of receiving treatment given observed covariates). It mimics a randomized experiment by balancing the observed confounders across the two groups.

Purpose

Estimate Average Treatment Effect on the Treated (ATT) from observational data.
Reduce selection bias when treatment assignment is not random.
Create balanced groups for causal inference.
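
For reference, the ATT can be written in potential-outcomes notation. The symbols are assumptions introduced here for illustration: $Y(1)$ and $Y(0)$ denote a unit's outcome with and without treatment, and $T$ is the treatment indicator.

```latex
\mathrm{ATT} = E\left[\, Y(1) - Y(0) \mid T = 1 \,\right]
```

Only $Y(1)$ is observed for treated units, so matching is used to estimate the missing term $E[Y(0) \mid T = 1]$ from comparable controls.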

When to Use

Use PSM When...

Treatment assignment is not random (observational study).
You have data on covariates that predict treatment selection.
You want to estimate a causal effect without a natural experiment.

Limitations

Cannot address unobserved confounders. If unmeasured variables affect both treatment and outcome, the PSM estimate remains biased.

Theoretical Background

The Propensity Score

$e(x) = P(\text{Treatment} = 1 \mid X = x)$

Key Insight: Instead of matching on many covariates (curse of dimensionality), match on a single summary: the propensity score.
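
The reason a single score can stand in for all covariates is the balancing property of the propensity score, a standard result stated here as a sketch. The notation is an assumption of this example: $T$ is the treatment indicator, $Y(0)$ and $Y(1)$ are potential outcomes, and $\perp\!\!\!\perp$ denotes independence.

```latex
X \perp\!\!\!\perp T \mid e(X)
\qquad\text{and}\qquad
\{Y(0), Y(1)\} \perp\!\!\!\perp T \mid X
\;\Longrightarrow\;
\{Y(0), Y(1)\} \perp\!\!\!\perp T \mid e(X)
```

In words: among units with the same propensity score, covariates are distributed alike in both groups, and if treatment is unconfounded given $X$ it is also unconfounded given $e(X)$ alone.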

Matching Procedure

Estimate Scores: Fit Binary Logistic Regression with Treatment as outcome, covariates as predictors.
Match: Pair treated units with control units having similar scores.
Check Balance: Verify covariates are balanced after matching (absolute Standardized Mean Difference < 0.1 for each covariate).
Estimate Effect: Compare outcomes between matched treated and control groups.
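
The balance check in the procedure above can be sketched as a small helper. This is a minimal illustration, not a definitive implementation: the function name and the toy ages are hypothetical, and the pooled-standard-deviation denominator is one common convention (some tools divide by the treated-group standard deviation instead).

```python
import numpy as np

def standardized_mean_difference(x_treated, x_control):
    """Difference in group means divided by the pooled standard deviation."""
    x_treated = np.asarray(x_treated, dtype=float)
    x_control = np.asarray(x_control, dtype=float)
    # Pooled SD: square root of the average of the two sample variances.
    pooled_sd = np.sqrt((x_treated.var(ddof=1) + x_control.var(ddof=1)) / 2.0)
    if pooled_sd == 0:
        # Undefined when neither group shows any variation.
        return float("nan")
    return (x_treated.mean() - x_control.mean()) / pooled_sd

# Hypothetical toy data: ages in a matched treated and control group.
age_treated = [50, 52, 54, 56]
age_control = [51, 53, 55, 57]

smd = standardized_mean_difference(age_treated, age_control)
print(f"SMD = {smd:.3f}, balanced = {abs(smd) < 0.1}")
```

Running the helper on every covariate, before and after matching, shows whether matching actually improved balance.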

Assumptions

Critical Assumptions

Conditional Independence Assumption (CIA): Given covariates $X$, treatment assignment is independent of potential outcomes (no unobserved confounders).
Common Support (Overlap): For every treated unit, there exists a control with a similar propensity score.

Assumptions Checklist

CIA: All confounders are observed and included. (Cannot be tested; relies on domain knowledge).
Common Support: Overlap exists. Check propensity score distributions.
Correct Model Specification: Logistic model for propensity score is correctly specified.
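
The common support check in the list above can be roughed out numerically as well as visually. The sketch below uses a simple min-max rule, which is only one heuristic; the function name and the toy scores are hypothetical.

```python
import numpy as np

def common_support_mask(ps, treated):
    """Flag units whose score lies inside the range shared by both groups."""
    ps = np.asarray(ps, dtype=float)
    treated = np.asarray(treated).astype(bool)
    # The shared range runs from the larger of the two group minima
    # to the smaller of the two group maxima.
    lower = max(ps[treated].min(), ps[~treated].min())
    upper = min(ps[treated].max(), ps[~treated].max())
    return (ps >= lower) & (ps <= upper)

# Hypothetical propensity scores: four controls followed by four treated units.
ps = np.array([0.05, 0.20, 0.40, 0.60, 0.30, 0.55, 0.80, 0.95])
treated = np.array([0, 0, 0, 0, 1, 1, 1, 1])

mask = common_support_mask(ps, treated)
print("Units on common support:", int(mask.sum()), "of", len(ps))
print("Treated units off support:", int((~mask & (treated == 1)).sum()))
```

Treated units outside the shared range have no comparable controls; dropping them changes the population the estimate describes, so that should be reported alongside the result.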

Limitations

Pitfalls

Unobserved Confounders: If a key variable is missing, estimates are biased.
Overlap Violations: If treated and control units have very different characteristics, some treated units have no comparable control and cannot be matched well.
Sensitivity to Model: Propensity score model misspecification can bias results.

Python Implementation

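import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Assumes `df` is a pandas DataFrame with a binary 'Treated' column,
# an 'Outcome' column, and the covariates listed below.
covariates = ['Age', 'Income', 'Health']

# 1. Estimate Propensity Scores
# Note: scikit-learn's LogisticRegression applies L2 regularization by default.
logit = LogisticRegression(max_iter=1000)
logit.fit(df[covariates], df['Treated'])
df['ps'] = logit.predict_proba(df[covariates])[:, 1]

# 2. Nearest Neighbor Matching (1:1, with replacement: a control can be reused)
treated = df[df['Treated'] == 1]
control = df[df['Treated'] == 0]

nn = NearestNeighbors(n_neighbors=1).fit(control[['ps']])
distances, indices = nn.kneighbors(treated[['ps']])

matched_control = control.iloc[indices.flatten()].reset_index(drop=True)
matched_treated = treated.reset_index(drop=True)

# 3. Check Balance (standardized mean difference per covariate; target |SMD| < 0.1)
for col in covariates:
    diff = matched_treated[col].mean() - matched_control[col].mean()
    pooled_sd = np.sqrt((matched_treated[col].var() + matched_control[col].var()) / 2)
    print(f"{col}: SMD = {diff / pooled_sd:.3f}")

# 4. Estimate ATT (mean outcome difference across matched pairs)
att = matched_treated['Outcome'].mean() - matched_control['Outcome'].mean()
print(f"ATT: {att:.3f}")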
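
The nearest-neighbor step above accepts the closest control however far away it is and may reuse the same control several times. A common variation is to match without replacement and to reject pairs whose scores differ by more than a caliper. The sketch below is a minimal greedy version under those assumptions; the function name, the toy scores, and the 0.05 caliper are hypothetical illustrative choices.

```python
import numpy as np

def greedy_caliper_match(ps_treated, ps_control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Returns (treated_index, control_index) pairs. Treated units with no
    unused control within `caliper` are left unmatched.
    """
    ps_treated = np.asarray(ps_treated, dtype=float)
    ps_control = np.asarray(ps_control, dtype=float)
    available = np.ones(len(ps_control), dtype=bool)
    pairs = []
    for i, score in enumerate(ps_treated):
        if not available.any():
            break  # every control has already been used
        dist = np.abs(ps_control - score)
        dist[~available] = np.inf  # exclude controls that are already matched
        j = int(np.argmin(dist))
        if dist[j] <= caliper:
            pairs.append((i, j))
            available[j] = False
    return pairs

# Hypothetical scores: the third treated unit has no control nearby.
pairs = greedy_caliper_match([0.30, 0.32, 0.90], [0.31, 0.35, 0.50])
print(pairs)
```

Greedy matching depends on the order in which treated units are processed, and unmatched treated units are dropped from the estimate, so both choices are worth reporting.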

R Implementation

library(MatchIt)

# 1. Matching
m_out <- matchit(Treated ~ Age + Income + Health, data = df, 
                 method = "nearest", distance = "glm")

# 2. Check Balance
summary(m_out)

# 3. Get Matched Data
matched <- match.data(m_out)

# 4. Estimate Effect (Regression on Matched Data, using the matching weights)
model <- lm(Outcome ~ Treated + Age + Income + Health, data = matched,
            weights = weights)
summary(model)

Interpretation Guide

Output	Interpretation
Standardized Mean Diff < 0.1	Good balance after matching.
ATT = 5.2	On average, treated units have outcomes 5.2 units higher than their matched controls; this is a causal effect only if the assumptions above hold.
Poor overlap (no matches)	Treated and control too different. Results unreliable.

Related Concepts

Binary Logistic Regression - Estimates propensity score.
Instrumental Variables (IV) - Alternative for endogeneity.
Difference-in-Differences (DiD)
