Consider the following true stories:
- A lower percentage of Democrats than Republicans voted in favor of the 1964 Civil Rights Act, despite the fact that it was originally proposed by one Democratic president (Kennedy) and eventually signed by another (Lyndon Johnson).
- In the 1970s, women were admitted to graduate programs at Berkeley at much lower rates than men, leading to the threat of a gender discrimination lawsuit in 1973. Yet an exhaustive inquiry found no evidence of gender discrimination.
- Voters with incomes above $50,000 were more likely to vote for Trump in 2016 than voters with incomes below $50,000, yet political scholarship and conventional wisdom heavily attribute support for Trump to frustrated working-class voters.
- A 1986 study found that non-invasive kidney stone removal had a higher success rate than traditional open surgery, yet open surgery remained the standard procedure.
- Across the US economy, overall earnings have risen since 2000, even though earnings for every education bracket have declined.
How can it be that the numbers, used by intelligent people, tell opposing stories?
When the Whole Doesn’t Equal the Parts
These stories are examples of Simpson’s Paradox, in which a trend that appears in aggregated data reverses or disappears when the data are split into groups. It can be stated as two fallacies:
- Fallacy of Division: What is true of the whole is not always true of the parts
- Fallacy of Composition: What is true of the parts is not always true of the whole
These two maxims hold true in every single one of the stories described above.
Consider the story of the Civil Rights Act. In 1964, the vote in the Senate looked like this:
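The arithmetic behind the reversal can be checked with a short Python sketch. The tallies below are an assumption: they are the commonly cited Senate yea/nay counts, split by party and by North/South, and are not taken from this article. The names `votes`, `yea_rate`, `overall` and `by_region` are illustrative only.

```python
# Senate votes on the 1964 Civil Rights Act as (yea, nay) pairs.
# Assumption: commonly cited historical tallies, not figures from this article.
votes = {
    ("North", "Democrat"): (45, 1),
    ("North", "Republican"): (27, 5),
    ("South", "Democrat"): (1, 20),
    ("South", "Republican"): (0, 1),
}


def yea_rate(tallies):
    """Share of yea votes across a list of (yea, nay) pairs."""
    yea = sum(y for y, _ in tallies)
    total = sum(y + n for y, n in tallies)
    return yea / total


# The whole: pool both regions for each party.
overall = {
    party: yea_rate([t for (_, p), t in votes.items() if p == party])
    for party in ("Democrat", "Republican")
}

# The parts: one rate per (region, party) cell.
by_region = {key: yea_rate([tally]) for key, tally in votes.items()}

for party, rate in overall.items():
    print(f"Overall {party}: {rate:.0%}")
for (region, party), rate in by_region.items():
    print(f"{region} {party}: {rate:.0%}")
```

Under these assumed tallies, Republicans lead overall (about 82% to 69%), yet Democrats lead within each region (about 98% to 84% in the North, and about 5% to 0% in the South). The reversal comes from the uneven split: most of the Southern senators, who overwhelmingly voted nay, were Democrats.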