Vector Autoregression (VAR)

Overview

Definition

VAR (Vector Autoregression) is a stochastic process model used to capture the linear interdependencies among multiple time series. It generalizes the univariate autoregressive (AR) model by allowing for more than one evolving variable.
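
In general matrix notation, a VAR with p lags, written VAR(p), for a k-dimensional vector Y_t takes the form

Y_t = c + A_1 Y_{t-1} + A_2 Y_{t-2} + \cdots + A_p Y_{t-p} + e_t

where c is a k x 1 vector of intercepts, each A_i is a k x k coefficient matrix, and e_t is a vector of zero-mean error terms.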


1. Logic of VAR

In a VAR model, each variable depends on:

  1. Its own past values (lags).
  2. The past values of all other variables in the system.

Example (2 variables):

Y_{1,t} = c_1 + \phi_{11} Y_{1,t-1} + \phi_{12} Y_{2,t-1} + e_{1,t}
Y_{2,t} = c_2 + \phi_{21} Y_{1,t-1} + \phi_{22} Y_{2,t-1} + e_{2,t}

This treats all variables as endogenous (simultaneously determined).
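
As a minimal sketch of this recursion, the two equations can be simulated directly; the intercepts and coefficients below are made-up illustrative values, not estimates from data:

import numpy as np

# Made-up VAR(1) parameters for illustration only
c = np.array([0.5, 1.0])            # intercepts c1, c2
A = np.array([[0.6, 0.2],           # phi_11, phi_12
              [0.1, 0.5]])          # phi_21, phi_22

rng = np.random.default_rng(42)
T = 200
Y = np.zeros((T, 2))                # columns hold Y1 and Y2
for t in range(1, T):
    e = rng.normal(size=2)          # shocks e_{1,t}, e_{2,t}
    Y[t] = c + A @ Y[t - 1] + e     # each variable depends on both lagged variables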


2. Prerequisites

  1. Stationarity: All variables in the system must be stationary. If they are I(1) (integrated/trending) but cointegrated, use a VECM (Vector Error Correction Model) instead; a quick check is sketched after this list.
  2. Granger Causality: Often used to verify whether one series actually helps predict another.
  3. Lag Selection: Use Information Criteria (AIC, BIC) to determine the optimal lag length.
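
A minimal sketch of that stationarity check, assuming df is the same multi-column DataFrame used in the implementation section below, based on the Augmented Dickey-Fuller test from statsmodels:

from statsmodels.tsa.stattools import adfuller

# df is assumed to be a DataFrame with one column per series
for col in df.columns:
    stat, pvalue = adfuller(df[col].dropna())[:2]   # H0: the series has a unit root (non-stationary)
    print(f"{col}: ADF statistic = {stat:.3f}, p-value = {pvalue:.3f}")

# A common remedy for an I(1) series is to difference it once:
# df_diff = df.diff().dropna()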

3. Impulse Response Functions (IRF)

The individual coefficients of a VAR are hard to interpret directly. Instead, we use impulse response functions (IRFs), which trace out how a one-time shock to one variable propagates through the current and future values of every variable in the system.
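
A minimal sketch, assuming results is the fitted VARResults object produced in the implementation section below; orth=True applies a Cholesky orthogonalization so that the shocks are treated as uncorrelated:

# Assumes `results` is the fitted model from the implementation section below
irf = results.irf(10)               # responses over 10 periods
irf.plot(orth=True)                 # orthogonalized (Cholesky) impulse responses
irf.plot_cum_effects(orth=True)     # cumulative responses

fevd = results.fevd(10)             # forecast error variance decomposition
fevd.summary()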


4. Python Implementation Example

from statsmodels.tsa.api import VAR
import pandas as pd
import matplotlib.pyplot as plt

# Data: DataFrame with one column per (stationary) series
# df = pd.read_csv(...)

model = VAR(df)

# 1. Select the lag order via information criteria (AIC, BIC, HQIC, FPE)
lag_order = model.select_order(maxlags=15)
print(lag_order.summary())

# 2. Fit the model with the lag length chosen by AIC
results = model.fit(lag_order.aic)
print(results.summary())

# 3. Impulse response analysis over 10 periods
irf = results.irf(10)
irf.plot(orth=False)
plt.show()

# 4. Granger causality: do lags of 'CauseVar' help predict 'TargetVar'?
#    (first argument = the caused variable, second = the causing variable;
#     both are column names of df)
print(results.test_causality('TargetVar', 'CauseVar'))