ARIMA Models

Definition

Core Statement

ARIMA (AutoRegressive Integrated Moving Average) is a class of models for forecasting univariate time series; its differencing step lets it handle series that are stationary only after differencing. It combines three components:

  • AR(p): Autoregressive (past values).
  • I(d): Integrated (differencing for stationarity).
  • MA(q): Moving Average (past errors).

Purpose

  1. Forecast future values of a time series.
  2. Understand the temporal structure of data.
  3. Benchmark model for univariate forecasting.

When to Use

Use ARIMA When...

  • Data is a univariate time series.
  • Series is stationary (or can be made so via differencing).
  • Goal is short-term forecasting.

Alternatives

  • Exponential Smoothing (ETS): ARIMA(0,1,1) is equivalent to simple exponential smoothing (see the worked example below).
  • Seasonal ARIMA (SARIMA): when the series has a seasonal pattern (the Python example below sets seasonal=False).
  • ARIMAX / VAR: when external predictors or several related series must be modeled (see Pitfalls).

Theoretical Background

The Components

Component | Notation | Meaning
AR(p) | ϕ_1 Y_{t-1} + … + ϕ_p Y_{t-p} | Regress on past values.
I(d) | Δ^d Y_t | Difference the series d times to achieve stationarity.
MA(q) | θ_1 ε_{t-1} + … + θ_q ε_{t-q} | Regress on past errors.

Model Equation (ARIMA(p,d,q))

Y'_t = c + ϕ_1 Y'_{t-1} + … + ϕ_p Y'_{t-p} + θ_1 ε_{t-1} + … + θ_q ε_{t-q} + ε_t, where Y'_t = Δ^d Y_t is the series after differencing d times.
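
As an illustration of the equation, here is a small sketch that simulates an ARIMA(1,1,1) series directly from these terms (the parameter values are arbitrary assumptions chosen for the example, not taken from the text):

import numpy as np

np.random.seed(0)
n, c, phi1, theta1 = 200, 0.1, 0.6, 0.4        # illustrative parameters

eps = np.random.normal(size=n)                 # white-noise errors
dy = np.zeros(n)                               # Y'_t: the differenced series (d = 1)
for t in range(1, n):
    dy[t] = c + phi1 * dy[t - 1] + eps[t] + theta1 * eps[t - 1]

y = np.cumsum(dy)                              # undo the differencing to recover Y_t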

Identification (Box-Jenkins Method)

  1. Test Stationarity: Run ADF and KPSS tests; apply differencing until the series is stationary.
  2. Plot ACF/PACF: Use the patterns of the (differenced) series to guess q (ACF cutoff) and p (PACF cutoff).
  3. Fit Model: Estimate the parameters.
  4. Diagnose Residuals: They should be white noise (no remaining autocorrelation).
  5. Forecast.
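
A minimal Python sketch of steps 1–4 with statsmodels, assuming `series` is a pandas Series holding the data (the variable name and the candidate order (1, 1, 1) are illustrative):

from statsmodels.tsa.stattools import adfuller, kpss
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.tsa.arima.model import ARIMA
import matplotlib.pyplot as plt

# Step 1: stationarity tests on the raw series
print("ADF p-value:", adfuller(series)[1])   # < 0.05 suggests stationarity
print("KPSS p-value:", kpss(series)[1])      # < 0.05 suggests non-stationarity

# Step 2: ACF/PACF of the differenced series to pick q and p
diff = series.diff().dropna()
plot_acf(diff)
plot_pacf(diff)
plt.show()

# Steps 3-4: fit a candidate model and check residuals for autocorrelation
fit = ARIMA(series, order=(1, 1, 1)).fit()
print(acorr_ljungbox(fit.resid, lags=[10]))  # p > 0.05: residuals look like white noise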

Assumptions

  • The series is stationary after differencing d times.
  • The errors ε_t are uncorrelated white noise with constant variance.
  • The relationship between the current value and past values/errors is linear.

Limitations

Pitfalls

  1. Requires Stationarity: Non-stationary series must be differenced first.
  2. Univariate: Does not incorporate external predictors. Use ARIMAX or VAR (a sketch follows this list).
  3. Short-Term Forecasts Only: Long-term forecasts revert to the mean.
  4. Manual Order Selection: Box-Jenkins requires expertise. Use auto.arima() in R or pmdarima.auto_arima() in Python.
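
For pitfall 2, a hedged sketch of the ARIMAX idea using statsmodels' SARIMAX, assuming `series` is the target and `promo` / `promo_future` are hypothetical, aligned predictor series:

from statsmodels.tsa.statespace.sarimax import SARIMAX

# promo: hypothetical external predictor aligned with the target series
arimax = SARIMAX(series, exog=promo, order=(1, 1, 1)).fit(disp=False)
print(arimax.summary())

# Forecasting also requires future values of the exogenous variable
print(arimax.forecast(steps=4, exog=promo_future))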


Python Implementation

import pmdarima as pm
from statsmodels.tsa.arima.model import ARIMA
import matplotlib.pyplot as plt

# series: a univariate pandas Series of the data to model

# Auto-ARIMA (Automatic Order Selection)
auto_model = pm.auto_arima(series, seasonal=False, stepwise=True, trace=True)
print(auto_model.summary())

# Manual ARIMA
model = ARIMA(series, order=(2, 1, 2)).fit()
print(model.summary())

# Forecast
forecast = model.get_forecast(steps=10)
print(forecast.predicted_mean)

# Plot the forecast with its confidence interval
ci = forecast.conf_int()
ax = series.plot(label="observed")
forecast.predicted_mean.plot(ax=ax, label="forecast")
ax.fill_between(ci.index, ci.iloc[:, 0], ci.iloc[:, 1], alpha=0.2)
ax.legend()
plt.show()

R Implementation

library(forecast)
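# ts_data: a univariate ts object (assumed to already exist)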

# Auto-ARIMA (Best practice)
fit <- auto.arima(ts_data)
summary(fit)

# Forecast
fc <- forecast(fit, h = 12)
plot(fc)

# Check Residuals (Should show no autocorrelation)
checkresiduals(fit)

Worked Numerical Example

Forecasting Weekly Sales

Data: 100 weeks of sales data.
Process:

  1. Visual Check: Trend is upward (Non-stationary).
  2. Differencing (d=1): Values become stationary (ΔY_t = Y_t − Y_{t-1}).
  3. ACF Plot: Sharp cutoff after lag 1. Suggests MA(1).
  4. PACF Plot: Exponential decay. Consistent with an MA process rather than an AR process.

Model: ARIMA(0, 1, 1).

  • Equation: Y_t − Y_{t-1} = ε_t + θ_1 ε_{t-1}.
  • This is equivalent to Simple Exponential Smoothing.

Forecast: Next week's sales are a weighted average of recent sales, with more weight on the most recent week.
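
A small sketch of this worked example in Python; the actual 100-week sales series is not given, so a synthetic upward-trending series is simulated here purely for illustration:

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

np.random.seed(42)
sales = pd.Series(200 + 1.5 * np.arange(100) + np.random.normal(0, 10, 100))

fit = ARIMA(sales, order=(0, 1, 1)).fit()      # ARIMA(0,1,1) as identified above
print(fit.params)                              # estimated theta_1 (ma.L1)
print(fit.forecast(steps=1))                   # next week's sales forecast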


Interpretation Guide

Output | Interpretation | Edge Case Notes
ARIMA(1,1,1) | 1 AR term, 1 difference, 1 MA term. | Standard robust baseline.
ARIMA(0,1,0) | "Random Walk": forecast = last value. | Common for stock prices; hard to beat.
AIC = 300 vs 350 | The model with AIC 300 is preferred. | A difference of more than 2 is considered meaningful.
Ljung-Box p-value > 0.05 | Residuals are white noise (good). | The model has captured the available signal.
AR coefficient close to 1 | Possible remaining unit root. | The series might need more differencing.
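
A brief sketch of how the AIC and Ljung-Box rows of this table are read in code, assuming `series` is the fitted data and the two candidate orders are illustrative:

from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.tsa.arima.model import ARIMA

m1 = ARIMA(series, order=(1, 1, 1)).fit()
m2 = ARIMA(series, order=(0, 1, 0)).fit()
print("AIC:", m1.aic, "vs", m2.aic)            # lower AIC wins; a gap > 2 matters

lb = acorr_ljungbox(m1.resid, lags=[10])
print(lb)                                      # p > 0.05: residuals look like white noise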

Common Pitfall Example

The Stock Price Fallacy

Scenario: Trying to forecast Google Stock Price (P_t) using ARIMA.

Mistake: Fit ARIMA(2,1,2). Get a great in-sample fit (R² = 0.99).

Reality Check:

  • Plotting the forecast shows it just lags the real price by 1 day.
  • The best predictor of tomorrow's price is often today's price (Random Walk).
  • ARIMA cannot predict "shocks" or news.

Lesson: ARIMA works best for inertial systems (sales, temperature, inventory), not efficient markets (stocks).
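
A sketch of this fallacy on simulated data (a synthetic random walk stands in for the stock price, since no real series is provided): the elaborate ARIMA(2,1,2) barely improves on the naive last-value forecast.

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

np.random.seed(1)
price = pd.Series(100 + np.cumsum(np.random.normal(0, 1, 500)))  # synthetic random walk
train, test = price[:400], price[400:]

fit = ARIMA(train, order=(2, 1, 2)).fit()
arima_pred = fit.forecast(steps=len(test))
naive_pred = np.full(len(test), train.iloc[-1])                  # "tomorrow = today"

print("ARIMA RMSE:", np.sqrt(((test.values - arima_pred.values) ** 2).mean()))
print("Naive RMSE:", np.sqrt(((test.values - naive_pred) ** 2).mean()))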