Beta Distribution
Definition
Core Statement
The Beta Distribution is a continuous probability distribution defined on the interval [0, 1]. It is parametrized by two positive shape parameters, $\alpha$ and $\beta$.
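For reference, the density is usually written as follows, where $\alpha > 0$ and $\beta > 0$ denote the two shape parameters and $B(\alpha, \beta)$ is the Beta function acting as the normalizing constant. This is the conventional notation and is assumed here, since the note itself does not spell it out:

$$
f(x; \alpha, \beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)}, \qquad B(\alpha, \beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha+\beta)}, \qquad 0 \le x \le 1.
$$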
Purpose
- Bayesian Inference: Representing belief about a probability (e.g., "I think the conversion rate is between 2% and 5%").
- Modeling Rates: Click-through rates, batting averages, defect rates.
- Project Management: PERT distribution (Optimistic vs Pessimistic estimates).
Intuition: "Successes and Failures"
You can think of the two parameters as counts of prior observations:
- $\alpha$: Number of Successes.
- $\beta$: Number of Failures.
| Parameters | Shape | Interpretation |
|---|---|---|
| Beta(1, 1) | Flat (Uniform) | "I have no idea." (All probs equally likely). |
| Beta(2, 2) | Mound at 0.5 | "Weakly believe it's fair." |
| Beta(100, 100) | Sharp Spike at 0.5 | "Strongly believe it's fair." |
| Beta(10, 1) | Skewed Left (mass near 1) | "Almost certain to succeed." |
| Beta(0.5, 0.5) | U-Shape (bathtub) | "Either 0 or 1, but not middle." |
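
To put numbers on the shapes in the table, the sketch below prints the mean and standard deviation of each row using `scipy.stats.beta`. The helper name `summarize` is hypothetical and only there for illustration; the point is that larger parameter values at the same ratio give the same mean with a smaller spread.

```python
from scipy.stats import beta

# (alpha, beta) pairs taken from the table above
shapes = [(1, 1), (2, 2), (100, 100), (10, 1), (0.5, 0.5)]


def summarize(a, b):
    """Return (mean, standard deviation) of Beta(a, b)."""
    dist = beta(a, b)
    return dist.mean(), dist.std()


for a, b in shapes:
    mean, std = summarize(a, b)
    print(f"Beta({a}, {b}): mean={mean:.3f}, std={std:.3f}")
```

Beta(2, 2) and Beta(100, 100) share a mean of 0.5, but the second has a much smaller standard deviation, which is the "weak" versus "strong" belief distinction in the table.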
Worked Example: Batting Average
Problem
A new baseball player appears.
Estimate his batting average.
- Prior: League average is about 0.260. We use a prior of Beta(81, 219) (mean = $81/300 = 0.270$, close to the league average; effectively 300 "prior at-bats").
- Data: In his first game, he hits 1 out of 1 (a 100% average!).
- Naive Mean: $1/1 = 1.000$ (way too high).
Bayesian Update:
- New $\alpha = 81 + 1 = 82$.
- New $\beta = 219 + 0 = 219$.
- Posterior Mean: $82 / (82 + 219) = 82/301 \approx 0.272$.
Conclusion: The massive weight of the prior ("He's a rookie") keeps the estimate grounded. One hit doesn't make him a god. This is regularization.
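The update in this example can be sketched in a few lines. This is a minimal illustration, assuming the standard conjugate rule (add hits to the first parameter, misses to the second); the function name `update_beta` is hypothetical, and the credible interval is an extra not computed in the example above.

```python
from scipy.stats import beta


def update_beta(alpha, beta_param, hits, misses):
    """Conjugate update: add observed hits to alpha and misses to beta."""
    return alpha + hits, beta_param + misses


# Prior from the worked example: Beta(81, 219)
prior_alpha, prior_beta = 81, 219

# Data: 1 hit in 1 at-bat
post_alpha, post_beta = update_beta(prior_alpha, prior_beta, hits=1, misses=0)

posterior = beta(post_alpha, post_beta)
posterior_mean = posterior.mean()      # equals 82 / 301
low, high = posterior.interval(0.95)   # central 95% credible interval

print(f"Posterior: Beta({post_alpha}, {post_beta})")
print(f"Posterior mean: {posterior_mean:.3f}")
print(f"95% credible interval: ({low:.3f}, {high:.3f})")
```

Feeding in a full season of at-bats instead of one would let the data outweigh the 300 prior at-bats, so the posterior mean would move toward the player's observed average.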
Key Properties
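- Domain: $x \in [0, 1]$.
- Mean: $\dfrac{\alpha}{\alpha + \beta}$.
- Mode: $\dfrac{\alpha - 1}{\alpha + \beta - 2}$ (for $\alpha, \beta > 1$).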
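The mean and mode formulas can be sanity-checked numerically. The sketch below is only an illustration: `beta_mean` and `beta_mode` are hypothetical helper names, the parameters (2, 8) are an arbitrary choice, and the mode is located by a simple grid search over the density.

```python
import numpy as np
from scipy.stats import beta


def beta_mean(a, b):
    """Closed-form mean: a / (a + b)."""
    return a / (a + b)


def beta_mode(a, b):
    """Closed-form mode: (a - 1) / (a + b - 2), valid only for a, b > 1."""
    if a <= 1 or b <= 1:
        raise ValueError("mode formula requires a > 1 and b > 1")
    return (a - 1) / (a + b - 2)


a, b = 2, 8

# Locate the peak of the density on a fine grid as an independent check
grid = np.linspace(0, 1, 100_001)
numeric_mode = grid[np.argmax(beta.pdf(grid, a, b))]

print(f"mean: formula={beta_mean(a, b):.4f}, scipy={beta.mean(a, b):.4f}")
print(f"mode: formula={beta_mode(a, b):.4f}, grid search={numeric_mode:.4f}")
```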
Python Implementation
```python
import numpy as np
from scipy.stats import beta
import matplotlib.pyplot as plt

# Evaluation grid over the support [0, 1]
x = np.linspace(0, 1, 100)

# 1. Uninformed Prior
y1 = beta.pdf(x, 1, 1)

# 2. Strong Belief in Fairness
y2 = beta.pdf(x, 50, 50)

# 3. Skewed Belief (Low prob)
y3 = beta.pdf(x, 2, 8)

plt.plot(x, y1, label='Beta(1,1) [Uniform]')
plt.plot(x, y2, label='Beta(50,50) [Peaked]')
plt.plot(x, y3, label='Beta(2,8) [Low Rate]')
plt.xlabel("x")
plt.ylabel("Density")
plt.legend()
plt.title("Beta Distribution Shapes")
plt.show()
```
Related Concepts
- Bernoulli Distribution - Beta is the conjugate prior for its success probability.
- Bayesian Statistics - Heavy user of Beta.
- Conjugate Prior - Mathematical property making Beta useful.
- Dirichlet Distribution - Multivariate generalization (Beta for >2 categories).