Auto-Correlation (ACF & PACF)
Definition
Core Statement
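ACF (Auto-Correlation Function) measures the correlation between a time series and its lagged values (e.g., between $y_t$ and $y_{t-k}$ at lag $k$).
PACF (Partial Auto-Correlation Function) measures the correlation between $y_t$ and $y_{t-k}$ after removing the effect of the intermediate lags $y_{t-1}, \dots, y_{t-k+1}$.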
These are the primary tools for identifying the order (p, q) of ARIMA Models.
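To make the ACF definition concrete, the sample autocorrelation at lag $k$ can be computed by hand. The sketch below is a minimal illustration using the common biased estimator (every lag is divided by the lag-0 sum of squares). The function name `sample_acf` and the toy series are made up for this example and are not part of any library.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelation r_k for k = 0..max_lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]                 # deviations from the mean
    denom = sum(v * v for v in d)             # lag-0 sum of squares
    return [
        sum(d[t] * d[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]

# Toy series, invented purely for illustration.
series = [2.0, 4.0, 6.0, 8.0, 10.0, 8.0, 6.0, 4.0, 2.0, 4.0]
r = sample_acf(series, max_lag=3)
print([round(v, 3) for v in r])               # r[0] is always 1.0
```

Library routines such as `statsmodels.tsa.stattools.acf` should give matching values under their default settings, which is a quick sanity check on any hand-rolled version.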
Purpose
- Identify Seasonality: Spikes at regular intervals (e.g., every 12 lags for monthly data).
- Determine Model Order: Use the shape of ACF/PACF plots to choose $p$ (AR) and $q$ (MA).
- Residual Checking: Are the errors "White Noise"? (Ideally, the ACF should be close to zero, inside the confidence band, for all lags > 0.)
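The residual check can be sketched numerically. The example below assumes the usual large-sample approximation that, for white noise, each sample autocorrelation falls within about $\pm 1.96/\sqrt{n}$ with 95% probability; plotting libraries may compute the band slightly differently. The helper `lags_outside_band` and both test series are hypothetical, chosen only to show the contrast.

```python
import numpy as np

def lags_outside_band(resid, max_lag=20, z=1.96):
    """Return the lags (1..max_lag) whose sample ACF exceeds +/- z/sqrt(n)."""
    resid = np.asarray(resid, dtype=float)
    n = len(resid)
    d = resid - resid.mean()
    denom = np.dot(d, d)
    band = z / np.sqrt(n)                     # approximate white-noise band
    flagged = []
    for k in range(1, max_lag + 1):
        r_k = np.dot(d[k:], d[:-k]) / denom   # sample autocorrelation at lag k
        if abs(r_k) > band:
            flagged.append(k)
    return flagged

rng = np.random.default_rng(42)               # fixed seed for repeatability
white = rng.standard_normal(500)              # residuals that look like noise
structured = np.tile([1.0, -1.0], 250)        # residuals with leftover pattern

print("white noise, flagged lags:", lags_outside_band(white))
print("structured, flagged lags: ", lags_outside_band(structured))
```

A handful of flagged lags out of 20 is expected by chance; many flagged lags, or a flagged lag at 1 or at the seasonal period, suggests the model has left structure in the residuals.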
The Rules of Thumb (Box-Jenkins)
| Plot | AR Process (AR($p$)) | MA Process (MA($q$)) | ARMA ($p$, $q$) |
|---|---|---|---|
| ACF | Decays gradually (Geometric/Sinusoidal) | Cuts off after lag $q$ | Decays gradually |
| PACF | Cuts off after lag $p$ | Decays gradually | Decays gradually |
Mnemonic
- AR(p): Look at PACF. Significant spikes through lag $p$, then zero.
- MA(q): Look at ACF. Significant spikes through lag $q$, then zero.
Worked Example: Identifying a Model
Problem
You plot ACF and PACF for a stationary series.
Observation 1 (PACF):
- Huge spike at Lag 1.
- Spike at Lag 2 is small/insignificant.
- Conclusion: This suggests AR(1).
Observation 2 (ACF):
- ACF starts high (0.8) and slowly decays (0.64, 0.51...), roughly $0.8^k$ at lag $k$.
- This is consistent with an AR process (gradual decay).
Model Proposal: ARIMA(1, 0, 0).
Assumptions
Python Implementation
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
# Load data: define `data` as a 1-D numeric series without missing values
# before plotting, e.g.:
# data = pd.read_csv(...)['Sales']
# Plot
fig, ax = plt.subplots(1, 2, figsize=(12, 4))
plot_acf(data, lags=20, ax=ax[0])
plot_pacf(data, lags=20, ax=ax[1])
plt.show()
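# Interpretation:
# The blue shaded area is the 95% confidence band (the default, alpha=0.05).
# Spikes outside the band are statistically significant at the 5% level.
# Lag 0 is always 1, and about 1 in 20 lags can fall outside by chance alone.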
Common Pitfall
The "Intermediate" Trap
Why do we need PACF?
- If $y_{t-2}$ causes $y_{t-1}$, and $y_{t-1}$ causes $y_t$...
- Then $y_{t-2}$ will correlate with $y_t$ purely because of the chain reaction.
- ACF shows this "echo" (Lag 2 is correlated).
- PACF removes the middleman ($y_{t-1}$) and shows the pure correlation of Lag 2. (Result: Zero, for an AR(1) process.)
- Mistake: Using ACF to set the AR order usually leads to picking a $p$ that is way too high.
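The "middleman" idea can be made explicit with two regressions. The sketch below simulates an AR(1) series with an assumed coefficient of 0.8, then compares the plain lag-2 correlation with the correlation left after the linear effect of $y_{t-1}$ has been removed from both $y_t$ and $y_{t-2}$. The helper name `residual` is hypothetical, and this regression approach is one way to estimate the lag-2 partial autocorrelation, not necessarily the method a given library uses.

```python
import numpy as np

rng = np.random.default_rng(1)           # fixed seed for repeatability
phi, n = 0.8, 10000                      # assumed AR(1) coefficient and length
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.standard_normal()

# Aligned views of y_t, y_{t-1}, y_{t-2}
y_t, y_1, y_2 = y[2:], y[1:-1], y[:-2]

def residual(target, middle):
    """Part of `target` not explained by a straight-line fit on `middle`."""
    slope, intercept = np.polyfit(middle, target, 1)
    return target - (slope * middle + intercept)

acf_lag2 = np.corrcoef(y_t, y_2)[0, 1]                                   # the "echo"
pacf_lag2 = np.corrcoef(residual(y_t, y_1), residual(y_2, y_1))[0, 1]    # echo removed

print("Lag-2 correlation (ACF view): ", round(acf_lag2, 2))    # theory: 0.64
print("Lag-2 partial corr (PACF view):", round(pacf_lag2, 2))  # theory: 0
```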
Related Concepts
- ARIMA Models - The model built using these tools.
- Stationarity (ADF & KPSS) - Prerequisite.
- Ljung-Box Test - Statistical test for "White Noise" (checks a group of lags jointly).