Bayesian Statistics via Probabilistic Programming
Overview
This note illustrates the practical application of Bayesian Statistics using modern Probabilistic Programming frameworks. It bridges abstract Bayesian theory with computational implementation for real-world inference.
1. Bayesian A/B Testing
The Problem
Traditional frequentist approaches rely on p-values and binary decision rules. The Bayesian approach instead directly estimates the probability that one variant is superior to another, e.g. P(p_B > p_A).
Implementation Example
import pymc as pm
import numpy as np

# Sample data
visitors_A, conversions_A = 1000, 50
visitors_B, conversions_B = 1000, 65

with pm.Model() as ab_test:
    # Priors: uninformative Beta(1, 1) priors (equivalent to Uniform[0, 1])
    p_A = pm.Beta('p_A', alpha=1, beta=1)
    p_B = pm.Beta('p_B', alpha=1, beta=1)

    # Likelihoods: Binomial distributions
    obs_A = pm.Binomial('obs_A', n=visitors_A, p=p_A, observed=conversions_A)
    obs_B = pm.Binomial('obs_B', n=visitors_B, p=p_B, observed=conversions_B)

    # Deterministic: difference in conversion rates
    delta = pm.Deterministic('delta', p_B - p_A)

    # Inference: MCMC sampling
    trace = pm.sample(draws=2000, tune=1000)

# Analysis
prob_b_better = float((trace.posterior['delta'] > 0).mean())
print(f"Probability B is better than A: {prob_b_better:.1%}")
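Because the Beta prior is conjugate to the Binomial likelihood, the posteriors here are Beta(1 + conversions, 1 + non-conversions) in closed form, so the MCMC result can be sanity-checked with plain NumPy. This is a minimal sketch using the same sample data; the 100,000-draw size is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(42)

visitors_A, conversions_A = 1000, 50
visitors_B, conversions_B = 1000, 65

# Conjugacy: Beta(1, 1) prior + Binomial likelihood -> Beta(1 + k, 1 + n - k) posterior
post_A = rng.beta(1 + conversions_A, 1 + visitors_A - conversions_A, size=100_000)
post_B = rng.beta(1 + conversions_B, 1 + visitors_B - conversions_B, size=100_000)

# Monte Carlo estimate of P(p_B > p_A) from exact posterior draws
prob_b_better = (post_B > post_A).mean()
print(f"P(p_B > p_A) ~ {prob_b_better:.1%}")
```

The estimate should closely match the MCMC answer; large discrepancies would suggest a sampling or model-specification problem.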
2. Bayesian Linear Regression
Unlike OLS, which provides point estimates, Bayesian regression yields full posterior distributions for coefficients, allowing for robust uncertainty quantification.
with pm.Model() as linear_model:
    # Priors for unknown model parameters
    alpha = pm.Normal('alpha', mu=0, sigma=10)
    beta = pm.Normal('beta', mu=0, sigma=10)
    sigma = pm.HalfNormal('sigma', sigma=1)

    # Expected value of the outcome (X is the predictor array)
    mu = alpha + beta * X

    # Likelihood (sampling distribution) of the observations
    y_obs = pm.Normal('y', mu=mu, sigma=sigma, observed=y)

    # Inference
    trace = pm.sample()
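The model above assumes arrays X and y already exist. For a self-contained run, synthetic data can be generated as below; the "true" parameter values (alpha = 1, beta = 2, sigma = 0.5) are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
true_alpha, true_beta, true_sigma = 1.0, 2.0, 0.5

# Simulate 100 observations from the assumed linear model
X = rng.normal(size=100)
y = true_alpha + true_beta * X + rng.normal(scale=true_sigma, size=100)

# Quick OLS check: point estimates should land near the true parameters,
# and the Bayesian posterior means should be similar
beta_hat, alpha_hat = np.polyfit(X, y, 1)
print(alpha_hat, beta_hat)
```

With data generated this way, the posterior distributions for alpha and beta should concentrate around the true values, while their widths quantify the remaining uncertainty.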
3. Hierarchical Models
Hierarchical (Multilevel) Models are a hallmark of Bayesian analysis, allowing information sharing (partial pooling) across groups.
with pm.Model() as hierarchical_model:
    # Hyperpriors (population level)
    mu_global = pm.Normal('mu_global', mu=0, sigma=10)
    sigma_global = pm.HalfNormal('sigma_global', sigma=5)

    # Group-level parameters (drawn from the population prior)
    mu_group = pm.Normal('mu_group', mu=mu_global, sigma=sigma_global, shape=n_groups)

    # Likelihood using group-specific parameters
    sigma = pm.HalfNormal('sigma', sigma=1)
    pm.Normal('y', mu=mu_group[group_idx], sigma=sigma, observed=y)

    trace = pm.sample()
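The model assumes y, group_idx, and n_groups are already defined. A hypothetical layout for three groups with 20 observations each might look like this (the group means are made-up values for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n_groups = 3
group_means = np.array([0.0, 1.0, 2.0])  # hypothetical true group means

# group_idx maps each observation to its group: [0]*20 + [1]*20 + [2]*20
group_idx = np.repeat(np.arange(n_groups), 20)
y = group_means[group_idx] + rng.normal(scale=1.0, size=group_idx.size)

print(group_idx.shape, y.shape)  # (60,) (60,)
```

Indexing mu_group with group_idx inside the likelihood is what ties each observation to its group's parameter, and the shared hyperpriors are what produce the partial-pooling (shrinkage) behavior.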
4. Model Workflow and Checking
- Prior Predictive Checks: Simulate data from the priors before observing actual data to ensure the assumptions are reasonable.
  with model:
      idata_prior = pm.sample_prior_predictive()
- Convergence Diagnostics: Check R̂ (R-hat, which should be close to 1.0) and the Effective Sample Size (ESS).
- Posterior Predictive Checks: Compare data simulated from the fitted model against the observed data to assess fit.
  with model:
      pm.sample_posterior_predictive(trace, extend_inferencedata=True)
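R̂ compares between-chain and within-chain variance: when chains have mixed, the two agree and R̂ is close to 1.0. In practice one would use a library routine such as ArviZ's, which implements the more robust rank-normalized split-R̂; the simplified, non-split version below is a sketch of the idea only:

```python
import numpy as np

def rhat(chains: np.ndarray) -> float:
    """Basic (non-split) Gelman-Rubin R-hat for an (n_chains, n_draws) array."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()  # average within-chain variance
    B = n * chain_means.var(ddof=1)        # between-chain variance (scaled)
    var_plus = (n - 1) / n * W + B / n     # pooled posterior-variance estimate
    return float(np.sqrt(var_plus / W))

rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 1000))           # four well-mixed chains
stuck = mixed + np.arange(4)[:, None] * 3.0  # chains stuck at different locations

print(rhat(mixed))  # close to 1.0
print(rhat(stuck))  # far above 1.0, signalling non-convergence
```

A common rule of thumb is to treat R̂ above roughly 1.01 as a sign that the chains have not converged and more tuning or reparameterization is needed.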
5. Related Concepts
- Bayesian Statistics - Theoretical foundations.
- Probabilistic Programming - Framework details.
- Likelihood Function - Core component.