Vector Autoregression (VAR)

Overview

Definition

VAR (Vector Autoregression) is a stochastic process model used to capture the linear interdependencies among multiple time series. It generalizes the univariate autoregressive (AR) model by allowing for more than one evolving variable.
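
In general matrix notation, a VAR with p lags, written VAR(p), for a k-dimensional vector Y_t takes the form

Y_t = c + A_1 Y_{t-1} + A_2 Y_{t-2} + \cdots + A_p Y_{t-p} + e_t

where c is a k x 1 vector of intercepts, each A_i is a k x k coefficient matrix, and e_t is a vector of zero-mean error terms.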


1. Logic of VAR

In a VAR model, each variable depends on:

  1. Its own past values (lags).
  2. The past values of all other variables in the system.

Example (2 variables):

Y_{1,t} = c_1 + \phi_{11} Y_{1,t-1} + \phi_{12} Y_{2,t-1} + e_{1,t}
Y_{2,t} = c_2 + \phi_{21} Y_{1,t-1} + \phi_{22} Y_{2,t-1} + e_{2,t}

This treats all variables as endogenous (simultaneously determined).
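
As a minimal sketch of this recursion, the two equations can be simulated directly; the intercepts and coefficients below are made-up illustrative values, not estimates from data:

import numpy as np

# Made-up VAR(1) parameters for illustration only
c = np.array([0.5, 1.0])            # intercepts c1, c2
A = np.array([[0.6, 0.2],           # phi_11, phi_12
              [0.1, 0.5]])          # phi_21, phi_22

rng = np.random.default_rng(42)
T = 200
Y = np.zeros((T, 2))                # columns hold Y1 and Y2
for t in range(1, T):
    e = rng.normal(size=2)          # shocks e_{1,t}, e_{2,t}
    Y[t] = c + A @ Y[t - 1] + e     # each variable depends on both lagged variables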


2. Prerequisites

  1. Stationarity: All variables in the system must be stationary. If they are I(1) (integrated/trending) but cointegrated, use a VECM (Vector Error Correction Model) instead; a quick check is sketched after this list.
  2. Granger Causality: Often used to verify whether one series actually helps predict another.
  3. Lag Selection: Use Information Criteria (AIC, BIC) to determine the optimal lag length.
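
A minimal sketch of that stationarity check, assuming df is the same multi-column DataFrame used in the implementation section below, based on the Augmented Dickey-Fuller test from statsmodels:

from statsmodels.tsa.stattools import adfuller

# df is assumed to be a DataFrame with one column per series
for col in df.columns:
    stat, pvalue = adfuller(df[col].dropna())[:2]   # H0: the series has a unit root (non-stationary)
    print(f"{col}: ADF statistic = {stat:.3f}, p-value = {pvalue:.3f}")

# A common remedy for an I(1) series is to difference it once:
# df_diff = df.diff().dropna()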

3. Impulse Response Functions (IRF)

The individual coefficients of a VAR are hard to interpret directly. Instead, we use impulse response functions (IRFs), which trace out how a one-time shock to one variable propagates through the current and future values of every variable in the system.
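
A minimal sketch, assuming results is the fitted VARResults object produced in the implementation section below; orth=True applies a Cholesky orthogonalization so that the shocks are treated as uncorrelated:

# Assumes `results` is the fitted model from the implementation section below
irf = results.irf(10)               # responses over 10 periods
irf.plot(orth=True)                 # orthogonalized (Cholesky) impulse responses
irf.plot_cum_effects(orth=True)     # cumulative responses

fevd = results.fevd(10)             # forecast error variance decomposition
fevd.summary()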


4. Python Implementation Example

from statsmodels.tsa.api import VAR
import pandas as pd
import matplotlib.pyplot as plt

# Data: DataFrame with one column per (stationary) series
# df = pd.read_csv(...)

model = VAR(df)

# 1. Select the lag order via information criteria (AIC, BIC, HQIC, FPE)
lag_order = model.select_order(maxlags=15)
print(lag_order.summary())

# 2. Fit the model with the lag length chosen by AIC
results = model.fit(lag_order.aic)
print(results.summary())

# 3. Impulse response analysis over 10 periods
irf = results.irf(10)
irf.plot(orth=False)
plt.show()

# 4. Granger causality: do lags of 'CauseVar' help predict 'TargetVar'?
#    (first argument = the caused variable, second = the causing variable;
#     both are column names of df)
print(results.test_causality('TargetVar', 'CauseVar'))