Two-Way ANOVA

Definition

Core Statement

Two-Way ANOVA extends One-Way ANOVA to examine the effects of two independent categorical variables (factors) on a continuous outcome. It can detect main effects of each factor and their interaction effect.


Purpose

  1. Test whether each of the two factors affects the outcome on its own (main effects).
  2. Test whether the effect of one factor depends on the level of the other (interaction).
  3. Answer both questions in one model instead of several one-way ANOVAs, which cannot test for interaction.
  4. Provide the foundation for factorial experimental designs.

When to Use

Use Two-Way ANOVA When...

  • You have two categorical independent variables (factors).
  • You have one continuous dependent variable.
  • You want to test for main effects and interaction.
  • The data meet the ANOVA assumptions (normality of residuals, homogeneity of variance, independent observations).
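
The assumption checks in the last bullet can be sketched as follows. This is a minimal sketch, not a definitive procedure: it reuses the Diet and Exercise example data from the Python Implementation section, and the choice of Shapiro-Wilk and Levene's test is an assumption, since this document does not prescribe specific tests.

```python
import pandas as pd
from scipy import stats

# Same example data as in the Python Implementation section.
df = pd.DataFrame({
    "WeightLoss": [5, 6, 7, 8, 3, 4, 5, 6, 7, 8, 9, 10, 4, 5, 6, 7],
    "Diet": (["A"] * 4 + ["B"] * 4) * 2,
    "Exercise": ["Low", "Low", "High", "High"] * 4,
})

# Residuals of the full two-way model are deviations from the cell means.
cell_means = df.groupby(["Diet", "Exercise"])["WeightLoss"].transform("mean")
residuals = df["WeightLoss"] - cell_means

# Normality of the residuals (Shapiro-Wilk; chosen here as an assumption).
shapiro_stat, shapiro_p = stats.shapiro(residuals)

# Homogeneity of variance across the four cells (Levene's test).
cells = [g["WeightLoss"].to_numpy() for _, g in df.groupby(["Diet", "Exercise"])]
levene_stat, levene_p = stats.levene(*cells)

print(f"Shapiro-Wilk p = {shapiro_p:.3f}, Levene p = {levene_p:.3f}")
```

Small p-values would cast doubt on the corresponding assumption. Independence cannot be tested this way; it has to follow from how the data were collected.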

Alternatives


Theoretical Background

The Model

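$$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$

Term                      Meaning
$\mu$                     Grand mean
$\alpha_i$                Main effect of Factor A (level i)
$\beta_j$                 Main effect of Factor B (level j)
$(\alpha\beta)_{ij}$      Interaction effect between A and B
$\varepsilon_{ijk}$       Random error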
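
For a balanced design, each term of the model can be estimated directly from group means. The sketch below assumes the Diet and Exercise example data from the Python Implementation section, with Diet as Factor A and Exercise as Factor B. The variable names are illustrative, and the approach only holds when every cell has the same number of observations.

```python
import pandas as pd

# Same example data as in the Python Implementation section (4 observations per cell).
df = pd.DataFrame({
    "WeightLoss": [5, 6, 7, 8, 3, 4, 5, 6, 7, 8, 9, 10, 4, 5, 6, 7],
    "Diet": (["A"] * 4 + ["B"] * 4) * 2,
    "Exercise": ["Low", "Low", "High", "High"] * 4,
})

# Grand mean (mu).
mu = df["WeightLoss"].mean()

# Main effects: level mean minus grand mean.
alpha = df.groupby("Diet")["WeightLoss"].mean() - mu      # Factor A = Diet
beta = df.groupby("Exercise")["WeightLoss"].mean() - mu   # Factor B = Exercise

# Interaction: what is left of each cell mean after removing mu, alpha_i and beta_j.
cell = df.groupby(["Diet", "Exercise"])["WeightLoss"].mean().unstack()
interaction = cell.sub(alpha, axis=0).sub(beta, axis=1) - mu

print("mu =", mu)
print(alpha)
print(beta)
print(interaction)
```

In this data set every interaction estimate comes out as zero, so the cell means are exactly additive in the two factors.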

Three Hypotheses Tested

Test                  Null Hypothesis
Main Effect A         Factor A has no effect ($\alpha_1 = \alpha_2 = \dots = 0$)
Main Effect B         Factor B has no effect ($\beta_1 = \beta_2 = \dots = 0$)
Interaction A×B       No interaction ($(\alpha\beta)_{ij} = 0$ for all i, j)
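
Each of the three hypotheses is tested with an F ratio: the mean square for that effect divided by the error mean square. The sketch below computes these by hand for a balanced design, again assuming the example data from the Python Implementation section. It is meant to show where the numbers in an ANOVA table come from, not to replace a library routine, and the formulas used here do not carry over to unbalanced designs.

```python
import pandas as pd
from scipy import stats

df = pd.DataFrame({
    "WeightLoss": [5, 6, 7, 8, 3, 4, 5, 6, 7, 8, 9, 10, 4, 5, 6, 7],
    "Diet": (["A"] * 4 + ["B"] * 4) * 2,
    "Exercise": ["Low", "Low", "High", "High"] * 4,
})

y = df["WeightLoss"]
mu = y.mean()
a_mean = df.groupby("Diet")["WeightLoss"].transform("mean")
b_mean = df.groupby("Exercise")["WeightLoss"].transform("mean")
cell_mean = df.groupby(["Diet", "Exercise"])["WeightLoss"].transform("mean")

# Sums of squares (balanced design only).
ss_a = ((a_mean - mu) ** 2).sum()
ss_b = ((b_mean - mu) ** 2).sum()
ss_ab = ((cell_mean - a_mean - b_mean + mu) ** 2).sum()
ss_error = ((y - cell_mean) ** 2).sum()

# Degrees of freedom.
a, b, n = df["Diet"].nunique(), df["Exercise"].nunique(), len(df)
df_a, df_b, df_ab, df_error = a - 1, b - 1, (a - 1) * (b - 1), n - a * b

# F ratios and upper-tail p-values.
ms_error = ss_error / df_error
f_a = (ss_a / df_a) / ms_error
f_b = (ss_b / df_b) / ms_error
f_ab = (ss_ab / df_ab) / ms_error
p_a = stats.f.sf(f_a, df_a, df_error)
p_b = stats.f.sf(f_b, df_b, df_error)
p_ab = stats.f.sf(f_ab, df_ab, df_error)

print(f"Diet: F={f_a:.2f}, p={p_a:.4f}")
print(f"Exercise: F={f_b:.2f}, p={p_b:.4f}")
print(f"Diet x Exercise: F={f_ab:.2f}, p={p_ab:.4f}")
```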

Interaction Effect

What is Interaction?

Interaction exists when the effect of Factor A depends on the level of Factor B.

Example: Studying effectiveness of Drug (A) and Diet (B) on weight loss.

  • No Interaction: Drug and Diet work independently; effects are additive.
  • Interaction: Drug only works when combined with Diet X (synergy).

Visualization: In an interaction plot, roughly parallel lines suggest no interaction, while clearly non-parallel (diverging or crossing) lines suggest one.
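
For a 2×2 design, the interaction can be summarized as a "difference of differences" between cell means. The sketch below applies this to the Drug and Diet example. The cell means are hypothetical numbers invented for illustration, and `interaction_contrast` is an illustrative helper, not a library function.

```python
import numpy as np

# Hypothetical mean weight loss; rows = Drug (no, yes), columns = Diet (standard, X).
additive = np.array([[1.0, 3.0],
                     [2.0, 4.0]])   # Drug adds 1 unit under either diet
synergy = np.array([[1.0, 3.0],
                    [1.0, 6.0]])    # Drug only helps when combined with Diet X


def interaction_contrast(cell_means):
    """Drug effect under Diet X minus Drug effect under the standard diet."""
    drug_effect_diet_x = cell_means[1, 1] - cell_means[0, 1]
    drug_effect_standard = cell_means[1, 0] - cell_means[0, 0]
    return drug_effect_diet_x - drug_effect_standard


print(interaction_contrast(additive))  # 0.0: parallel lines, no interaction
print(interaction_contrast(synergy))   # 3.0: non-parallel lines, interaction
```

A contrast of zero corresponds to parallel lines in the interaction plot. With real data the contrast is rarely exactly zero, so the F-test for A×B decides whether it is larger than chance would explain.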


Assumptions


Limitations

Pitfalls

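  1. Significant interaction complicates interpretation: If A×B is significant, main effects can be misleading on their own because they average over the levels of the other factor. Focus on simple effects (the effect of A at each level of B).
  2. Unbalanced designs: With unequal cell sizes the factor effects are no longer orthogonal, so Type I, II, and III sums of squares can give different results, and power may drop.
  3. Multiple comparisons: A significant main effect for a factor with more than two levels only shows that some levels differ; post-hoc tests (e.g., Tukey's HSD) are needed to identify which ones.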
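
One way to follow up on pitfalls 1 and 3 is to compare the individual Diet × Exercise cells with Tukey's HSD. The sketch below assumes the example data from the Python Implementation section and uses `pairwise_tukeyhsd` from statsmodels. Treating each cell as one group is a design choice made here for illustration, not the only way to examine simple effects.

```python
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

df = pd.DataFrame({
    "WeightLoss": [5, 6, 7, 8, 3, 4, 5, 6, 7, 8, 9, 10, 4, 5, 6, 7],
    "Diet": (["A"] * 4 + ["B"] * 4) * 2,
    "Exercise": ["Low", "Low", "High", "High"] * 4,
})

# Label each observation with its cell, e.g. "A:Low".
df["Cell"] = df["Diet"] + ":" + df["Exercise"]

# All pairwise cell comparisons with family-wise error control at 5%.
tukey = pairwise_tukeyhsd(endog=df["WeightLoss"], groups=df["Cell"], alpha=0.05)
print(tukey.summary())
```

The rows comparing cells that share an Exercise level (for example A:Low vs B:Low) correspond to the simple effect of Diet at that level.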


Python Implementation

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Example Data: Weight Loss by Diet and Exercise
data = {
    'WeightLoss': [5, 6, 7, 8, 3, 4, 5, 6, 7, 8, 9, 10, 4, 5, 6, 7],
    'Diet': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Exercise': ['Low', 'Low', 'High', 'High', 'Low', 'Low', 'High', 'High'] * 2
}
df = pd.DataFrame(data)

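# Fit Two-Way ANOVA: main effects of Diet and Exercise plus their interaction
model = ols('WeightLoss ~ C(Diet) + C(Exercise) + C(Diet):C(Exercise)', data=df).fit()
# Type II sums of squares; for this balanced design Types I, II, and III agree
anova_table = sm.stats.anova_lm(model, typ=2)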

print(anova_table)

# Interaction Plot
import matplotlib.pyplot as plt
grouped = df.groupby(['Diet', 'Exercise'])['WeightLoss'].mean().unstack()
grouped.plot(marker='o', figsize=(8, 5))
plt.title('Interaction Plot: Diet × Exercise')
plt.ylabel('Mean Weight Loss')
plt.xlabel('Diet')
plt.legend(title='Exercise')
plt.show()

R Implementation

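# Example Data (same cell layout as the Python example: 4 observations per cell)
df <- data.frame(
  WeightLoss = c(5, 6, 7, 8, 3, 4, 5, 6, 7, 8, 9, 10, 4, 5, 6, 7),
  Diet = factor(rep(rep(c('A', 'B'), each = 4), times = 2)),
  Exercise = factor(rep(rep(c('Low', 'High'), each = 2), times = 4))
)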

# Two-Way ANOVA
model <- aov(WeightLoss ~ Diet * Exercise, data = df)
summary(model)

# Interaction Plot
interaction.plot(df$Diet, df$Exercise, df$WeightLoss,
                 col = c("red", "blue"), lwd = 2,
                 xlab = "Diet", ylab = "Mean Weight Loss",
                 trace.label = "Exercise")

# Post-Hoc (if main effects significant)
TukeyHSD(model)

Interpretation Guide

Result                            Interpretation
Diet: F=8.5, p=0.003              Main effect of Diet is significant.
Exercise: F=12.1, p<0.001         Main effect of Exercise is significant.
Diet×Exercise: F=0.8, p=0.39      No significant interaction; the data are consistent with additive effects.
Diet×Exercise: F=6.2, p=0.02      Significant interaction: the effect of Diet depends on Exercise level. Analyze simple effects.