MANOVA
MANOVA
Overview
Definition
MANOVA (Multivariate Analysis of Variance) is an extension of ANOVA that tests for the difference in two or more vectors of means. It is used when there are multiple dependent variables that are correlated.
1. Why Not Multiple ANOVAs?
Running separate ANOVAs for each dependent variable causes two problems:
- Type I Error Inflation: Multiple testing problem.
- Ignoring Correlations: Separate ANOVAs ignore the relationship between dependent variables. MANOVA might detect a difference in the joint distribution that individual ANOVAs miss.
2. Hypotheses
: The mean vectors across groups are equal. : At least one group differs on at least one dependent variable.
Worked Example: Plant Growth
Problem
You test 3 fertilizers (A, B, C) on plants.
Outcomes (DVs):
- Height (cm)
- Weight (g)
If you run two separate ANOVAs:
- Height:
(Not Sig). - Weight:
(Not Sig).
Run MANOVA:
- Plants with Fertilizer A are Tall and Skinny.
- Plants with Fertilizer B are Short and Fat.
- Plants with Fertilizer C are Tall and Fat.
MANOVA Result:
Reasoning: Even though the marginal distributions (Height alone, Weight alone) overlap, the joint clusters in 2D space are distinct. MANOVA captures this correlation.
3. Test Statistics
Since we are comparing matrices (Variance-Covariance matrices), there is no single F-value. Common statistics include:
- Wilks' Lambda: Most common. Ratio of determinants (
). Values close to 0 indicate groups are distinct. - Pillai's Trace: Consider the most robust to violations (e.g., small sample, unequal variance). Use this if assumptions are shaky.
- Hotelling-Lawley Trace: Good for two groups (generalizes T-test).
- Roy's Largest Root: Upper bound; most powerful if difference is on only one dimension, but sensitive to assumptions.
4. Assumptions
- Multivariate Normality: Dependent variables should be normally distributed within groups.
- Homogeneity of Covariance Matrices: Box's M Test (sensitive).
- Linearity: Linear relationships between DVs.
- No Multicollinearity: DVs shouldn't be too highly correlated (
).
5. Python Implementation Example
from statsmodels.multivariate.manova import MANOVA
import pandas as pd
# Formula interface: DV1 + DV2 ~ Group
model = MANOVA.from_formula('Sepal_Length + Sepal_Width + Petal_Length ~ Species', data=df)
print(model.mv_test())
6. Related Concepts
- One-Way ANOVA - Univariate version.
- Hotelling's T-Squared - Two-group multivariate test (Multivariate t-test).
- Principal Component Analysis (PCA) - Can be used to reduce DVs before analysis.