Survivorship Bias
Definition
Core Statement
Survivorship bias is the logical error of drawing conclusions only from the people or things that "survived" some selection process while overlooking those that did not, usually because the failures are no longer visible. The conclusions are false because the surviving sample is not representative of the whole population.
Purpose
- Correct decision making: Avoid optimizing for the wrong traits.
- Investment analysis: Average mutual fund performance looks better than it is because failed funds are closed or merged and vanish from the record.
- Startup/Success advice: "I dropped out of college and succeeded" ignores the many who dropped out and failed.
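The mutual fund point can be illustrated with a small simulation. This is a minimal sketch, not a model of any real fund market: the return distribution, the closure rule (`closure_threshold`), and the function name `simulate_funds` are all assumptions made up for illustration.

```python
import random
import statistics


def simulate_funds(n_funds=1000, n_years=10, closure_threshold=0.7, seed=0):
    """Return (mean final value of all funds, mean of survivors, survivor count).

    Assumptions (illustrative only): every fund starts at 1.0, yearly returns
    are drawn from a normal distribution, and a fund is closed as soon as its
    value drops below `closure_threshold`.
    """
    rng = random.Random(seed)
    all_final = []       # every fund, including the closed ones
    survivor_final = []  # only the funds still open at the end

    for _ in range(n_funds):
        value = 1.0
        closed = False
        for _ in range(n_years):
            value *= 1.0 + rng.gauss(0.05, 0.15)
            if value < closure_threshold:
                closed = True  # the fund vanishes from later rankings
                break
        all_final.append(value)
        if not closed:
            survivor_final.append(value)

    return (
        statistics.mean(all_final),
        statistics.mean(survivor_final),
        len(survivor_final),
    )


all_mean, survivor_mean, n_survivors = simulate_funds()
print(f"all funds:      {all_mean:.3f}")
print(f"survivors only: {survivor_mean:.3f} ({n_survivors} funds)")
```

Under these assumptions the survivors-only average is always at least as high as the average over all funds, because every closed fund ended below the threshold and every survivor ended at or above it. A report built only from funds that still exist would show the higher number.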
The Classic Example: WWII Planes
Abraham Wald & The Bullet Holes
Scenario: The military analyzed planes returning from battle.
Data: Most bullet holes were found on the wings and tail.
Military's Plan: "Put more armor on the wings and tail!"
Wald's Insight:
- The planes you are looking at returned. They survived.
- This means bullet holes in the wings and tail are survivable.
- The planes that were hit in the engine or cockpit never came back.
Conclusion: Armor the areas where the returning planes show no bullet holes (the engine and cockpit). The missing data tells the story.
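Wald's reasoning can be reproduced with a toy simulation. This is a sketch under stated assumptions, not historical data: the section names, the `LETHALITY` values, the uniform spread of hits across sections, and the function name `simulate_sorties` are all hypothetical.

```python
import random
from collections import Counter

# Hypothetical probability that a single hit in this section downs the plane.
# These numbers are invented for illustration, not taken from wartime records.
LETHALITY = {
    "wings": 0.05,
    "tail": 0.05,
    "fuselage": 0.10,
    "engine": 0.60,
    "cockpit": 0.50,
}


def simulate_sorties(n_planes=10000, hits_per_plane=3, seed=0):
    """Return (hits on all planes, hits seen on returning planes) as Counters."""
    rng = random.Random(seed)
    sections = list(LETHALITY)
    all_hits = Counter()
    observed_hits = Counter()

    for _ in range(n_planes):
        # Assumption: each hit lands on a uniformly random section.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        all_hits.update(hits)
        # The plane returns only if it survives every one of its hits.
        survived = all(rng.random() >= LETHALITY[s] for s in hits)
        if survived:
            observed_hits.update(hits)  # only returning planes get inspected

    return all_hits, observed_hits


all_hits, observed_hits = simulate_sorties()
for section in LETHALITY:
    true_share = all_hits[section] / sum(all_hits.values())
    seen_share = observed_hits[section] / sum(observed_hits.values())
    print(f"{section:9s} true share {true_share:.2f}   seen on returners {seen_share:.2f}")
```

Although hits are spread evenly across sections in this setup, the returning planes show relatively few engine and cockpit hits and relatively many wing and tail hits. The inspection data alone would point the armor at the wrong places.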
Common Scenarios
| Context | Biased View | What It Overlooks |
|---|---|---|
| Finance | "Stocks always go up" (judged from the current S&P 500 constituents) | Failed companies such as Enron and Lehman Brothers were delisted and no longer appear in the index. |
| History | "They don't make them like they used to" | You only see the old buildings that survived. The cheap ones collapsed long ago. |
| Business | "Do what Bill Gates did" | The luck factor and the silent graveyard of failed startups. |
Detection & Mitigation
Checklist
- Where are the failures? Am I seeing the full dataset or just the winners?
- Is the data censored? (e.g., customer surveys reach only current customers, not those who already churned in anger.)
- Look for the invisible: Ask "Who is missing from this room?"
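The first checklist question can be turned into a quick coverage check when a record of the starting population exists. A minimal sketch, assuming you have identifiers for everyone who entered the process (`initial_ids`) and for everyone present in the dataset you are analyzing (`observed_ids`); the function name `coverage_report` and the example IDs are hypothetical.

```python
def coverage_report(initial_ids, observed_ids):
    """Compare the starting population with the records actually observed."""
    initial = set(initial_ids)
    # Ignore observed records that were never part of the starting population.
    observed = set(observed_ids) & initial
    missing = initial - observed
    return {
        "initial": len(initial),
        "observed": len(observed),
        "missing": len(missing),
        "missing_fraction": len(missing) / len(initial) if initial else 0.0,
        "missing_ids": sorted(missing),
    }


# Example: customers who signed up vs. customers who answered the survey.
signed_up = ["c1", "c2", "c3", "c4"]
answered_survey = ["c1", "c3"]
report = coverage_report(signed_up, answered_survey)
print(report["missing_fraction"], report["missing_ids"])
```

A large `missing_fraction` does not prove the conclusions are wrong, but it signals that the missing cases should be examined before any trait of the observed group is treated as a cause of success.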
Related Concepts
- Selection Bias - The broader category.
- Missing Data - Techniques for handling gaps according to the mechanism of missingness (why the data are missing).
- Sampling Bias