Fisher's Exact Test
Definition
Fisher's Exact Test determines if there is a significant association between two categorical variables, specifically designed for small sample sizes where the Chi-Square test's approximations are unreliable. It calculates the exact probability of observing the data (or more extreme) under the null hypothesis of independence.
Purpose
- Test for association when expected cell counts are < 5.
- Provide exact p-values without relying on asymptotic approximations.
When to Use
- Contingency table is 2x2.
- Any expected cell count < 5.
- Sample size is small.
Fisher's Exact Test can be computationally intensive for large tables, but modern computers handle it easily. It is often used as a default for 2x2 tables regardless of cell counts.
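The expected-count rule of thumb above can be checked directly. The following is a minimal sketch in plain Python; the helper names `expected_counts` and `prefer_fisher` and the example table are assumptions made for illustration, not part of any library.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def prefer_fisher(table, threshold=5):
    """Return True if any expected count falls below the rule-of-thumb threshold."""
    return any(e < threshold for row in expected_counts(table) for e in row)


# Hypothetical 2x2 table: rows are groups, columns are outcomes.
example = [[1, 9], [5, 5]]
print(expected_counts(example))  # [[3.0, 7.0], [3.0, 7.0]]
print(prefer_fisher(example))    # True
```

Note that the rule is about expected counts, not observed counts, so a table with a small observed cell can still be suitable for Chi-Square if its margins are large.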
Worked Example: Rare Side Effect
You test a new drug vs placebo.
- Treatment ($n = 10$): 9 Healthy, 1 Side Effect.
- Placebo ($n = 10$): 5 Healthy, 5 Side Effects.
Question: Is the drug safer? (Does it reduce side effects?)
The Chi-Square approximation is unreliable here because the expected count of side effects in each group is $10 \times 6 / 20 = 3$, which is below 5. We need Fisher's.
Solution:
Table:
| | Side Effect | Healthy | Total |
|---|---|---|---|
| Drug | 1 (a) | 9 (b) | 10 |
| Placebo | 5 (c) | 5 (d) | 10 |
| Total | 6 | 14 | 20 |
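- Calculate Probability of Observed Table: Using the Hypergeometric probability formula:
  $$P(a = 1) = \frac{\binom{10}{1}\binom{10}{5}}{\binom{20}{6}} = \frac{10 \times 252}{38760} \approx 0.0650$$
- Calculate More Extreme Tables:
  - Table with 0 Side Effects in Drug group (Treatment even better).
  - P(0 SE) = $\frac{\binom{10}{0}\binom{10}{6}}{\binom{20}{6}} = \frac{210}{38760} \approx 0.0054$.
- Total One-Sided P-Value: $p = 0.0650 + 0.0054 \approx 0.0704$.

Conclusion: At $\alpha = 0.05$, $p \approx 0.070 > 0.05$, so we fail to reject the null hypothesis. This sample does not provide significant evidence that the drug reduces side effects.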
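The one-sided p-value for the drug vs placebo table can be checked with a few lines of Python. This is a minimal sketch that uses only the standard library; the variable names and the helper `table_probability` are chosen here for illustration.

```python
from math import comb

# Counts from the worked example:
# rows = (Drug, Placebo), columns = (Side Effect, Healthy).
a, b = 1, 9
c, d = 5, 5

row1, row2 = a + b, c + d   # group sizes: 10 and 10
col1 = a + c                # total side effects: 6
n = row1 + row2             # grand total: 20


def table_probability(k):
    """Hypergeometric probability of k side effects in the drug group, margins fixed."""
    return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)


# "More extreme" in the direction of a safer drug means fewer side effects than observed.
smallest_k = max(0, col1 - row2)
p_observed = table_probability(a)
p_more_extreme = sum(table_probability(k) for k in range(smallest_k, a))
p_one_sided = p_observed + p_more_extreme

print(f"P(observed table) = {p_observed:.4f}")      # 0.0650
print(f"P(more extreme)   = {p_more_extreme:.4f}")  # 0.0054
print(f"One-sided p-value = {p_one_sided:.4f}")     # 0.0704
```

With SciPy installed, `fisher_exact([[1, 9], [5, 5]], alternative="less")` should report the same one-sided p-value.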
Theoretical Background
Hypergeometric Distribution
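Fisher's test assumes the row and column totals are fixed. The probability of observing exactly $a$ in the top-left cell is:

$$P(X = a) = \frac{\binom{R_1}{a}\binom{R_2}{C_1 - a}}{\binom{N}{C_1}}$$

Where:
- $R_1, R_2$: Row totals.
- $C_1$: Column 1 total.
- $N$: Grand total.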
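To make the hypergeometric calculation concrete, the sketch below enumerates every table compatible with the fixed margins and sums the relevant probabilities. The function name `fisher_p_values` is hypothetical. The two-sided rule used here (sum the probabilities of all tables no more likely than the observed one) is one common convention, and it is an assumption of this sketch that it is the convention wanted.

```python
from math import comb


def fisher_p_values(a, b, c, d):
    """One- and two-sided Fisher p-values for the 2x2 table [[a, b], [c, d]]."""
    r1, r2 = a + b, c + d          # row totals
    c1 = a + c                     # column 1 total
    n = r1 + r2                    # grand total

    # Possible values of the top-left cell given the fixed margins.
    support = range(max(0, c1 - r2), min(r1, c1) + 1)

    # Integer numerators of the hypergeometric probabilities; the common
    # denominator is comb(n, c1). Integers keep the comparisons exact.
    weight = {k: comb(r1, k) * comb(r2, c1 - k) for k in support}
    total = comb(n, c1)

    less = sum(w for k, w in weight.items() if k <= a) / total
    greater = sum(w for k, w in weight.items() if k >= a) / total
    two_sided = sum(w for w in weight.values() if w <= weight[a]) / total
    return {"less": less, "greater": greater, "two_sided": two_sided}


result = fisher_p_values(8, 2, 1, 5)
print(f"two-sided p = {result['two_sided']:.4f}")  # 0.0350
print(f"greater   p = {result['greater']:.4f}")    # 0.0245
```

Enumerating the support is cheap for a 2x2 table because only one cell is free once the margins are fixed.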
Odds Ratio (Conditional MLE)
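Implementations of Fisher's test often report the Conditional Maximum Likelihood Estimate (MLE) of the Odds Ratio, which is more robust for small samples than the simple sample odds ratio ($\frac{ad}{bc}$). R's `fisher.test` reports the conditional MLE, whereas SciPy's `fisher_exact` returns the sample odds ratio, so the two can print different estimates for the same table.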
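The difference between the two estimates can be illustrated numerically. The sketch below assumes the usual definition of the conditional MLE: the odds ratio $\psi$ at which the expected top-left cell, under the noncentral hypergeometric distribution with the observed margins, equals the observed count. It is found here by bisection on $\log \psi$. The function names are hypothetical, and the search bracket of $\pm 50$ on the log scale is an arbitrary choice that is adequate for small tables.

```python
from math import comb, exp, inf, log


def expected_top_left(psi, r1, r2, c1):
    """Mean of the top-left cell under odds ratio psi with fixed margins."""
    support = range(max(0, c1 - r2), min(r1, c1) + 1)
    # Work on the log scale to avoid overflow for large psi.
    terms = [log(comb(r1, k) * comb(r2, c1 - k)) + k * log(psi) for k in support]
    m = max(terms)
    probs = [exp(t - m) for t in terms]
    return sum(k * p for k, p in zip(support, probs)) / sum(probs)


def conditional_mle_odds_ratio(a, b, c, d, tol=1e-10):
    """Conditional MLE of the odds ratio for the table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    k_min, k_max = max(0, c1 - r2), min(r1, c1)
    if a == k_min:
        return 0.0   # observed cell at the lower boundary
    if a == k_max:
        return inf   # observed cell at the upper boundary
    lo, hi = -50.0, 50.0  # bracket on log(psi); an assumption of this sketch
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # The mean increases with psi, so move the bracket toward the root.
        if expected_top_left(exp(mid), r1, r2, c1) < a:
            lo = mid
        else:
            hi = mid
    return exp((lo + hi) / 2)


a, b, c, d = 8, 2, 1, 5
sample_or = (a * d) / (b * c)
cond_or = conditional_mle_odds_ratio(a, b, c, d)
print(f"Sample odds ratio: {sample_or:.2f}")
print(f"Conditional MLE:   {cond_or:.2f}")
```

When the observed cell sits at either end of its possible range (for example, when one cell of the table is zero), the estimate is 0 or infinite, which is why the sketch returns those values directly instead of searching.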
Assumptions
Limitations
- Computationally Intensive for Large Tables: For tables larger than 2x2 with large counts, computation can be slow.
- Conservative: Fisher's test can be conservative (p-values slightly larger than necessary).
Python Implementation
```python
from scipy.stats import fisher_exact
import numpy as np

# 2x2 Table
#            Disease+  Disease-
# Exposed+      8         2
# Exposed-      1         5
table = np.array([[8, 2], [1, 5]])

# Two-sided test by default; SciPy returns the sample odds ratio (a*d / (b*c))
odds_ratio, p_val = fisher_exact(table)
print(f"Odds Ratio: {odds_ratio:.2f}")
print(f"p-value: {p_val:.4f}")
```
R Implementation
```r
# 2x2 Table
tbl <- matrix(c(8, 2, 1, 5), nrow = 2, byrow = TRUE)
result <- fisher.test(tbl)
print(result)
# Output includes:
# - p-value
# - Odds Ratio
# - 95% CI for OR
```
Interpretation Guide
| Output | Interpretation |
|---|---|
| p < 0.05 | Significant association exists. |
| OR = 20 | Exposed group has 20x the odds of disease compared to unexposed. |
| OR = $\infty$ | One cell is zero (e.g., No cases in treatment group). Perfect separation. |
| OR 95% CI excludes 1 | The effect is statistically significant. |
Related Concepts
- Chi-Square Test of Independence - For larger samples.
- Odds Ratio - Measure of effect.
- Effect Size Measures