Multinomial Logistic Regression (MNLogit)
Multinomial Logistic Regression
Definition
Multinomial Logistic Regression (MNLogit) extends Binary Logistic Regression to handle outcomes with more than two unordered categories (e.g., Transportation Mode: Car, Bus, Bike). It models the probability of each category relative to a reference category.
Purpose
- Model nominal (unordered) multi-class outcomes.
- Understand which factors predict category membership.
- Calculate predicted probabilities for each class.
When to Use
- Outcome has 3+ unordered categories.
- Predictors can be continuous, categorical, or mixed.
- Categories are mutually exclusive and exhaustive.
- Outcome is ordinal (use Ordinal Logistic Regression).
- Outcome is binary (use Binary Logistic Regression).
Theoretical Background
The Model
For
Relative Risk Ratio (RRR)
"A 1-unit increase in
Independence of Irrelevant Alternatives (IIA)
MNLogit assumes that the odds of choosing between any two categories are independent of other categories. (e.g., Preference for Car vs Bus is unaffected by adding a new "Train" option).
If violated, use Nested Logit or Mixed Logit.
Assumptions
Limitations
- IIA Violation: If adding a new category changes the relative odds of existing categories, the model is misspecified.
- Complexity: Interpretation requires
sets of coefficients. - Sparse Categories: Rare categories can cause estimation problems.
Python Implementation
import statsmodels.api as sm
# Fit MNLogit
X = sm.add_constant(df[['income', 'distance']])
y = df['transport_mode'] # Categories: 'car', 'bus', 'bike'
model = sm.MNLogit(y, X).fit()
print(model.summary())
# Relative Risk Ratios
import numpy as np
print("\n--- Relative Risk Ratios ---")
print(np.exp(model.params))
R Implementation
library(nnet)
# Fit MNLogit (multinom function)
# Relevel to set reference category
df$transport_mode <- relevel(factor(df$transport_mode), ref = "car")
model <- multinom(transport_mode ~ income + distance, data = df)
summary(model)
# Relative Risk Ratios
exp(coef(model))
# Confidence Intervals (z-test)
z <- summary(model)$coefficients / summary(model)$standard.errors
p <- (1 - pnorm(abs(z), 0, 1)) * 2
print(p)
Worked Numerical Example
Outcome: How do students commute? (Walk, Bike, Car)
Predictors: Distance_to_School (miles), Income ($1000s)
Base Category: Walk
Results:
| Outcome | Predictor | Coefficient | RRR | Interpretation |
|---|---|---|---|---|
| Walk (base) | - | 0 | 1.0 | Reference category |
| Bike vs Walk | Distance | 0.15 | 1.16 | Each mile → 16% more likely to bike than walk |
| Bike vs Walk | Income | -0.02 | 0.98 | Higher income → slightly less likely to bike |
| Car vs Walk | Distance | 0.40 | 1.49 | Each mile → 49% more likely to drive than walk |
| Car vs Walk | Income | 0.08 | 1.08 | Each $1K income → 8% more likely to drive |
Predictions for a student 5 miles away, income $40K:
- P(Walk) = 0.15
- P(Bike) = 0.25
- P(Car) = 0.60
- Most likely choice: Car
Interpretation Guide
| Output | Interpretation | Edge Case Notes |
|---|---|---|
| RRR (Car vs Walk) = 2.5 | Car 2.5× more likely than Walk per unit increase. | Relative to base only! Doesn't tell you Car vs Bike. |
| RRR = 1.0 | No effect on relative odds between categories. | Coefficient ≈ 0 (exactly 0 if no effect). |
| RRR = 0.5 | Outcome 50% less likely than base. | RRR < 1 means negative association. |
| All RRRs > 1 for one predictor | Predictor increases all non-base categories. | Common for variables like "service quality" in choice models. |
| Base category has P = 0.90 | Model may be unstable (rare outcomes). | Consider combining rare categories or multinomial ordered logit. |
Common Pitfall Example
Classic Example:
- Initial choices: Car (70%), Bus (30%)
- MNLogit assumes: P(Car)/P(Bus) = constant regardless of other options
IIA Assumption: If you add a "Red Bus" option:
- IIA predicts: Car (70%), Blue Bus (15%), Red Bus (15%)
- Reality: Car (70%), Blue Bus (20%), Red Bus (10%)
- Why? Red/Blue buses are close substitutes, violate independence!
Real Scenario:
- Brand choice: Coke, Pepsi, Store Brand
- Add "Diet Coke" → should draw more from Coke than Pepsi
- But MNLogit assumes all alternatives equally affected
Test IIA:
- Hausman test: Fit model with all options, then drop one
- If coefficients change substantially → IIA violated
Solutions if IIA violated:
- Nested Logit (group similar alternatives)
- Mixed Logit (random coefficients)
- Multinomial Probit (no IIA assumption)
When IIA is OK:
- Alternatives are truly distinct (Car, Bike, Walk)
- No obvious substitution patterns
Interpretation Guide
| Comparison | Coef | RRR | Interpretation |
|---|---|---|---|
| Bus vs Car (Income) | -0.5 | 0.61 | Higher income decreases odds of choosing Bus over Car by 39%. |
| Bike vs Car (Distance) | -1.2 | 0.30 | Longer distance decreases odds of Bike over Car by 70%. |
Related Concepts
- Binary Logistic Regression
- Ordinal Logistic Regression - For ordered outcomes.
- Generalized Linear Models (GLM)