Bayes' Theorem

Definition

Core Statement

Bayes' Theorem is a fundamental result in probability theory that describes how to update beliefs in light of new evidence. It provides the mathematical foundation for Bayesian Statistics and probabilistic reasoning.

Purpose

Calculate inverse conditional probabilities, i.e. $P(A | B)$ from $P(B | A)$.
Update prior beliefs with new data to obtain posterior beliefs.
Provide the foundation for diagnostic test interpretation, spam filters, and Bayesian inference.

When to Use

Use Bayes' Theorem When...

You need to reverse a conditional probability (e.g., obtain $P(A | B)$ from $P(B | A)$).
You want to combine prior knowledge with observed data.
You are interpreting a diagnostic test (probability of disease given a test result).
You are performing Bayesian inference in statistics.

Theoretical Background

The Formula

$$P(A | B) = \frac{P(B | A) \cdot P(A)}{P(B)}$$

Term	Name	Meaning
$P(A | B)$	Posterior	Probability of $A$ after observing $B$.
$P(B | A)$	Likelihood	Probability of observing $B$ given $A$.
$P(A)$	Prior	Probability of $A$ before observing $B$.
$P(B)$	Evidence	Total probability of $B$ (normalizing constant).

Extended Form (Law of Total Probability)

$$P(A | B) = \frac{P(B | A) \cdot P(A)}{P(B | A) \cdot P(A) + P(B | \neg A) \cdot P(\neg A)}$$
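The extended form translates almost directly into code. The following is a minimal sketch, not a reference implementation: the function name `bayes_posterior`, its parameter names, and the input values 0.5, 0.8, and 0.2 are illustrative assumptions and do not come from the examples in this note.

```python
def bayes_posterior(prior: float, p_b_given_a: float, p_b_given_not_a: float) -> float:
    """Return P(A | B) using the extended form of Bayes' Theorem."""
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must be a probability in [0, 1]")
    # Law of Total Probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    evidence = p_b_given_a * prior + p_b_given_not_a * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("P(B) is zero, so P(A | B) is undefined")
    return p_b_given_a * prior / evidence


# Illustrative inputs (assumed for this sketch only).
posterior = bayes_posterior(prior=0.5, p_b_given_a=0.8, p_b_given_not_a=0.2)
print(f"P(A | B) = {posterior:.3f}")
```

The explicit check on the evidence term reflects that $P(A | B)$ is only defined when $P(B) > 0$.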

Classic Example: Medical Diagnosis

Disease Testing

Disease prevalence: $P(\text{Disease}) = 0.01$ (1%).
Test sensitivity: $P(\text{Positive} | \text{Disease}) = 0.95$ (95% true positive rate).
Test specificity: $P(\text{Negative} | \text{No Disease}) = 0.90$ (90% true negative rate), so the false positive rate is $P(\text{Positive} | \text{No Disease}) = 0.10$.
Question: If someone tests positive, what is $P(\text{Disease} | \text{Positive})$?

Solution:

$$P(\text{Disease} | \text{Positive}) = \frac{P(\text{Pos} | \text{Dis}) \cdot P(\text{Dis})}{P(\text{Pos} | \text{Dis}) \cdot P(\text{Dis}) + P(\text{Pos} | \text{No Dis}) \cdot P(\text{No Dis})}$$

$$= \frac{0.95 \times 0.01}{0.95 \times 0.01 + 0.10 \times 0.99} = \frac{0.0095}{0.0095 + 0.099} = \frac{0.0095}{0.1085} \approx 0.088$$

Result: A positive test implies only about an 8.8% chance of actually having the disease. Because the base rate is low, false positives from the large healthy group outnumber true positives.
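One way to see why the posterior is so low is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical cohort of 10,000 people (the cohort size is not part of the example and is chosen only for readability) and applies the prevalence, sensitivity, and specificity stated above.

```python
population = 10_000   # hypothetical cohort size, assumed for illustration
prevalence = 0.01     # P(Disease)
sensitivity = 0.95    # P(Positive | Disease)
specificity = 0.90    # P(Negative | No Disease)

sick = population * prevalence
healthy = population - sick

true_positives = sick * sensitivity             # sick people who test positive
false_positives = healthy * (1 - specificity)   # healthy people who test positive

# Among everyone who tests positive, what share is actually sick?
share_sick_among_positives = true_positives / (true_positives + false_positives)

print(f"True positives:  {true_positives:.0f}")
print(f"False positives: {false_positives:.0f}")
print(f"P(Disease | Positive) = {share_sick_among_positives:.3f}")
```

In this cohort roughly 95 sick people and 990 healthy people test positive, so just under 9% of positive results belong to someone who has the disease.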

Example 2: Spam Filter

"Free Money" Filter

A spam filter looks for the word "Free".

Prior: 40% of all emails are Spam ($P(S) = 0.40$), 60% are Ham ($P(H) = 0.60$).
Likelihood (Spam): 80% of Spam emails contain "Free" ($P(F | S) = 0.80$).
Likelihood (Ham): 10% of Ham emails contain "Free" ($P(F | H) = 0.10$).

Question: If an email contains "Free", what is the probability it is Spam?

Solution:

$$P(S | F) = \frac{P(F | S) \cdot P(S)}{P(F | S) \cdot P(S) + P(F | H) \cdot P(H)}$$

$$P(S | F) = \frac{0.80 \times 0.40}{(0.80 \times 0.40) + (0.10 \times 0.60)}$$

$$P(S | F) = \frac{0.32}{0.32 + 0.06} = \frac{0.32}{0.38} \approx 0.842 \quad (84.2\%)$$

Conclusion: The presence of the word "Free" increases the probability of being spam from 40% (Prior) to 84.2% (Posterior).
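The same calculation can be written as "multiply each prior by its likelihood, then normalize", which also yields the posterior for Ham. This is a minimal sketch under the assumption that the hypotheses are mutually exclusive and exhaustive; the helper name `posterior_over_hypotheses` and the dictionary keys are hypothetical.

```python
def posterior_over_hypotheses(
    priors: dict[str, float], likelihoods: dict[str, float]
) -> dict[str, float]:
    """Normalize prior * likelihood across mutually exclusive, exhaustive hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # P(F), by the Law of Total Probability
    if evidence == 0:
        raise ValueError("the evidence has zero probability under every hypothesis")
    return {h: joint[h] / evidence for h in joint}


priors = {"spam": 0.40, "ham": 0.60}
likelihood_free = {"spam": 0.80, "ham": 0.10}  # P("Free" | class)

posterior = posterior_over_hypotheses(priors, likelihood_free)
for label, prob in posterior.items():
    print(f"P({label} | 'Free') = {prob:.3f}")
```

Because the denominator is shared, the posteriors sum to 1, and the sketch extends to more than two classes without changes.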

Assumptions

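All probabilities involved are well-defined, and $P(B) > 0$ so that $P(A | B)$ exists.
Events are properly conditioned: the likelihood and the prior refer to the same events and the same population.
Prior probabilities are available (or can be estimated).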

Limitations

Pitfalls

Base Rate Neglect: People often ignore $P(A)$ and focus only on $P(B | A)$. For a sufficiently rare disease, even a 99% accurate test yields more false positives than true positives.
The Prosecutor's Fallacy: Confusing $P(\text{Evidence} | \text{Innocent})$ with $P(\text{Innocent} | \text{Evidence})$. Even if an innocent person is very unlikely to match the DNA (low likelihood), the probability that the matching person is innocent (posterior) is not necessarily low when the prior probability of guilt is tiny.
Zero Priors (Dogmatism): If you assign $P(\text{Hypothesis}) = 0$, no amount of evidence can ever change your mind. Any hypothesis that should remain possible needs a non-zero prior.
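Two of these pitfalls can be checked numerically. The sketch below assumes a hypothetical test that is "99% accurate" in both directions (sensitivity 0.99, false positive rate 0.01) and sweeps over assumed prevalence values; the function name and the chosen prevalences are illustrative only.

```python
def posterior_given_positive(prior: float, sensitivity: float, false_positive_rate: float) -> float:
    """P(Disease | Positive) via the extended form of Bayes' Theorem."""
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence


# Hypothetical "99% accurate" test (assumed values).
sensitivity = 0.99
false_positive_rate = 0.01

# A prevalence of 0.0 illustrates the zero-prior case; small values illustrate base rate neglect.
for prevalence in (0.0, 0.001, 0.01, 0.1, 0.5):
    p = posterior_given_positive(prevalence, sensitivity, false_positive_rate)
    print(f"prevalence={prevalence:.3f}  P(Disease | Positive)={p:.3f}")
```

With a prior of exactly 0 the posterior stays at 0 whatever the test says, and for prevalences below 1% the posterior stays below 0.5, meaning most positives are false positives.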

Python Implementation

# Medical Test Example
P_disease = 0.01
P_pos_given_disease = 0.95
P_pos_given_no_disease = 0.10

# Bayes' Theorem
numerator = P_pos_given_disease * P_disease
denominator = (P_pos_given_disease * P_disease + 
               P_pos_given_no_disease * (1 - P_disease))

P_disease_given_pos = numerator / denominator
print(f"P(Disease | Positive Test): {P_disease_given_pos:.3f}")

R Implementation

# Medical Test Example
P_disease <- 0.01
P_pos_given_disease <- 0.95
P_pos_given_no_disease <- 0.10

# Bayes' Theorem
numerator <- P_pos_given_disease * P_disease
denominator <- (P_pos_given_disease * P_disease + 
                P_pos_given_no_disease * (1 - P_disease))

P_disease_given_pos <- numerator / denominator
cat("P(Disease | Positive Test):", round(P_disease_given_pos, 3), "\n")

Interpretation Guide

Result	Interpretation
Posterior > Prior	Evidence supports the hypothesis (Bayes Factor > 1).
Posterior < Prior	Evidence counts against the hypothesis (Bayes Factor < 1).
Prior = 0	Dogmatism: Belief cannot be updated, regardless of evidence.
Posterior $\approx$ 1	Near certainty: Evidence is so strong that it overwhelms the prior (or the prior was already high).
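The table refers to the Bayes Factor without defining it. Assuming it means the likelihood ratio $P(B | A) / P(B | \neg A)$ for two competing hypotheses, Bayes' Theorem can be restated in odds form: posterior odds = prior odds × Bayes Factor. The following sketch applies that form to the spam example's numbers; the function name `update_with_bayes_factor` is hypothetical.

```python
def update_with_bayes_factor(prior: float, bayes_factor: float) -> float:
    """Update a prior probability using the odds form of Bayes' Theorem."""
    if not 0.0 <= prior < 1.0:
        raise ValueError("prior must be in [0, 1) so that the odds are finite")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * bayes_factor
    # Convert the odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)


# Likelihood ratio from the spam example: P("Free" | Spam) / P("Free" | Ham).
bayes_factor = 0.80 / 0.10
posterior = update_with_bayes_factor(prior=0.40, bayes_factor=bayes_factor)
print(f"Bayes factor: {bayes_factor:.1f}")
print(f"P(Spam | 'Free') = {posterior:.3f}")
```

A factor above 1 moves the posterior above the prior, a factor below 1 moves it below, and a prior of 0 stays at 0 for any factor, matching the rows of the table.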

Related Concepts

Bayesian Statistics - Statistical framework built on Bayes' Theorem.
Conditional Probability
Law of Total Probability
Sensitivity and Specificity
