Bayes' Theorem

Definition

Core Statement

Bayes' Theorem is a fundamental result in probability theory that describes how to update beliefs in light of new evidence. It provides the mathematical foundation for Bayesian Statistics and probabilistic reasoning.


Purpose

  1. Calculate conditional probabilities (reverse probabilities).
  2. Update prior beliefs with new data to obtain posterior beliefs.
  3. Provide the foundation for diagnostic testing, spam filtering, and Bayesian inference.

When to Use

Use Bayes' Theorem When...

  • You need to reverse a conditional probability (e.g., P(A|B) from P(B|A)).
  • You want to integrate prior knowledge with observed data.
  • You are interpreting a diagnostic test (probability of disease given a test result).
  • You are performing Bayesian inference in statistics.


Theoretical Background

The Formula

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

Term     Name        Meaning
------   ----------  ----------------------------------------------
P(A|B)   Posterior   Probability of A after observing B.
P(B|A)   Likelihood  Probability of observing B given A.
P(A)     Prior       Probability of A before observing B.
P(B)     Evidence    Total probability of B (normalizing constant).

Extended Form (Law of Total Probability)

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid \neg A)\,P(\neg A)}$$
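
As a minimal sketch, the extended form can be wrapped in a small helper. The function name `bayes_posterior` and its parameter names are illustrative choices for this page, not part of any library.

```python
def bayes_posterior(prior, likelihood, likelihood_given_complement):
    """Return P(A|B) using the extended form of Bayes' Theorem.

    prior                        -- P(A)
    likelihood                   -- P(B|A)
    likelihood_given_complement  -- P(B|not A)
    (All names are hypothetical, chosen for this sketch.)
    """
    for p in (prior, likelihood, likelihood_given_complement):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")

    numerator = likelihood * prior
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
    evidence = numerator + likelihood_given_complement * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    return numerator / evidence


# With a 50/50 prior, the posterior depends only on the two likelihoods.
example = bayes_posterior(prior=0.5, likelihood=0.8, likelihood_given_complement=0.2)
print(f"{example:.3f}")
```

The explicit check for a zero denominator reflects the fact that conditioning on an event of probability zero is undefined in this elementary form.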

Classic Example: Medical Diagnosis

Disease Testing

  • Disease prevalence: P(Disease)=0.01 (1%).
  • Test sensitivity: P(Positive|Disease)=0.95 (95% true positive rate).
  • Test specificity: P(Negative|NoDisease)=0.90 (90% true negative rate).
  • Question: If someone tests positive, what is P(Disease|Positive)?

Solution:

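The false positive rate follows from the specificity: P(Positive|NoDisease) = 1 − 0.90 = 0.10.

$$
\begin{aligned}
P(\text{Disease} \mid \text{Positive})
  &= \frac{P(\text{Pos} \mid \text{Dis})\,P(\text{Dis})}{P(\text{Pos} \mid \text{Dis})\,P(\text{Dis}) + P(\text{Pos} \mid \text{NoDis})\,P(\text{NoDis})} \\
  &= \frac{0.95 \times 0.01}{0.95 \times 0.01 + 0.10 \times 0.99} \\
  &= \frac{0.0095}{0.0095 + 0.099} \approx 0.088
\end{aligned}
$$

Result: Only about an 8.8% chance of actually having the disease, despite a positive test. The low base rate means false positives from the large healthy group outnumber true positives from the small diseased group.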
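
The same result can be read off as head counts, which many readers find easier to follow than the formula. The sketch below assumes a hypothetical cohort of 10,000 people; the cohort size is not from the example and only serves to turn probabilities into whole numbers.

```python
# Hypothetical cohort size, chosen only to make the counts easy to read.
population = 10_000

prevalence = 0.01    # P(Disease)
sensitivity = 0.95   # P(Positive | Disease)
specificity = 0.90   # P(Negative | NoDisease)

diseased = round(population * prevalence)             # 100 people
healthy = population - diseased                       # 9,900 people

true_positives = round(diseased * sensitivity)        # 95 people
false_positives = round(healthy * (1 - specificity))  # 990 people

# Among everyone who tests positive, what share is actually diseased?
share_diseased = true_positives / (true_positives + false_positives)

print(f"True positives:  {true_positives}")
print(f"False positives: {false_positives}")
print(f"P(Disease | Positive) = {share_diseased:.4f}")
```

The positive group is dominated by the 990 healthy people who test positive, which is why the posterior stays small.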


Example 2: Spam Filter

"Free Money" Filter

A spam filter looks for the word "Free".

  • Prior: 40% of all emails are Spam (P(S)=0.40), 60% are Ham (P(H)=0.60).
  • Likelihood (Spam): 80% of Spam emails contain "Free" (P(F|S)=0.80).
  • Likelihood (Ham): 10% of Ham emails contain "Free" (P(F|H)=0.10).

Question: If an email contains "Free", what is the probability it is Spam?

Solution:

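$$
\begin{aligned}
P(S \mid F) &= \frac{P(F \mid S)\,P(S)}{P(F \mid S)\,P(S) + P(F \mid H)\,P(H)} \\
            &= \frac{0.80 \times 0.40}{(0.80 \times 0.40) + (0.10 \times 0.60)} \\
            &= \frac{0.32}{0.32 + 0.06} = \frac{0.32}{0.38} \approx 84.2\%
\end{aligned}
$$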

Conclusion: The presence of the word "Free" increases the probability of being spam from 40% (Prior) to 84.2% (Posterior).
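
The arithmetic can be checked with exact fractions, which avoids any floating-point rounding. This is only a sketch of the single-word example above; the variable names are illustrative.

```python
from fractions import Fraction

p_spam = Fraction(40, 100)             # P(S), the prior
p_ham = 1 - p_spam                     # P(H)
p_free_given_spam = Fraction(80, 100)  # P(F|S)
p_free_given_ham = Fraction(10, 100)   # P(F|H)

# Evidence: total probability that an email contains "Free".
p_free = p_free_given_spam * p_spam + p_free_given_ham * p_ham

# Posterior: probability of spam given that "Free" appears.
p_spam_given_free = p_free_given_spam * p_spam / p_free

print(p_spam_given_free)                  # 16/19
print(f"{float(p_spam_given_free):.1%}")  # 84.2%
```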


Assumptions


Limitations

Pitfalls

  1. Base Rate Neglect: People often ignore P(A) and focus only on P(B|A). For a sufficiently rare disease, even a 99% accurate test can yield more false positives than true positives.
  2. The Prosecutor's Fallacy: Confusing P(Evidence|Innocent) with P(Innocent|Evidence). A low probability that an innocent person would match the DNA (the likelihood) does not by itself mean a low probability that the matching person is innocent (the posterior); when the prior probability of guilt is tiny, the posterior probability of innocence can remain substantial.
  3. Zero Priors (Dogmatism): If you assign P(Hypothesis)=0, no amount of evidence can ever change your mind. Bayesian updating requires a non-zero prior for every hypothesis that should remain possible.
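
The zero-prior pitfall is easy to see numerically. The sketch below applies the same update ten times, feeding each posterior back in as the next prior. It assumes the observations are conditionally independent given the hypothesis, and the 0.99 and 0.01 likelihoods are made-up values for illustration.

```python
def update(prior, likelihood, likelihood_given_complement):
    """One Bayesian update of P(H) after a single observation."""
    numerator = likelihood * prior
    return numerator / (numerator + likelihood_given_complement * (1 - prior))


def update_repeatedly(prior, n_observations):
    # Hypothetical evidence: each observation is 99 times more likely
    # under the hypothesis than under its complement (0.99 vs 0.01).
    belief = prior
    for _ in range(n_observations):
        belief = update(belief, 0.99, 0.01)
    return belief


dogmatic = update_repeatedly(prior=0.0, n_observations=10)
skeptical = update_repeatedly(prior=1e-6, n_observations=10)

print(f"Prior 0:     posterior = {dogmatic}")
print(f"Prior 1e-6:  posterior = {skeptical:.6f}")
```

A prior of exactly zero stays at zero because the numerator is always zero, while even a one-in-a-million prior is driven close to 1 by the same evidence.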


Python Implementation

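# Medical Test Example
P_disease = 0.01               # prior: disease prevalence
P_pos_given_disease = 0.95     # sensitivity (true positive rate)
P_pos_given_no_disease = 0.10  # false positive rate = 1 - specificity (0.90)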

# Bayes' Theorem
numerator = P_pos_given_disease * P_disease
denominator = (P_pos_given_disease * P_disease + 
               P_pos_given_no_disease * (1 - P_disease))

P_disease_given_pos = numerator / denominator
print(f"P(Disease | Positive Test): {P_disease_given_pos:.3f}")

R Implementation

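# Medical Test Example
P_disease <- 0.01               # prior: disease prevalence
P_pos_given_disease <- 0.95     # sensitivity (true positive rate)
P_pos_given_no_disease <- 0.10  # false positive rate = 1 - specificity (0.90)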

# Bayes' Theorem
numerator <- P_pos_given_disease * P_disease
denominator <- (P_pos_given_disease * P_disease + 
                P_pos_given_no_disease * (1 - P_disease))

P_disease_given_pos <- numerator / denominator
cat("P(Disease | Positive Test):", round(P_disease_given_pos, 3), "\n")

Interpretation Guide

Result             Interpretation
-----------------  ----------------------------------------------------------------------------------------
Posterior > Prior  Evidence supports the hypothesis (Bayes Factor > 1).
Posterior < Prior  Evidence contradicts the hypothesis (Bayes Factor < 1).
Prior = 0          Dogmatism: Belief cannot be updated, regardless of evidence.
Posterior ≈ 1      Certainty: Evidence is so strong it overwhelms the prior (or prior was already high).
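
The Bayes Factor in the table can be made concrete with the odds form of the theorem: posterior odds equal prior odds times the Bayes Factor. The sketch below treats the Bayes Factor as the likelihood ratio P(B|A) / P(B|¬A), which is the appropriate definition when comparing a hypothesis with its complement; the function names are illustrative.

```python
def bayes_factor(likelihood, likelihood_given_complement):
    """Ratio P(B|A) / P(B|not A) for a hypothesis A and its complement."""
    return likelihood / likelihood_given_complement


def posterior_from_odds(prior, factor):
    """Posterior odds = prior odds x Bayes factor, converted back to a probability."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * factor
    return posterior_odds / (1 + posterior_odds)


# Medical example from this page: prior 0.01, P(Pos|Dis) 0.95, P(Pos|NoDis) 0.10
factor = bayes_factor(0.95, 0.10)
posterior = posterior_from_odds(0.01, factor)

print(f"Bayes Factor: {factor:.1f}")
print(f"Posterior:    {posterior:.4f}")
```

A factor above 1 moves the posterior above the prior, and a factor below 1 moves it below, matching the first two rows of the table.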