Causality: Models, Reasoning and Inference


Suppose you survey students in your class and discover that a higher proportion of the students who smoke received a final grade of A than of the students who do not smoke. Possible data are displayed in Table 1: 50 percent of the 10 smokers received an A, but only 40 percent of the 5 nonsmokers did. Puzzled by the seeming implication that smoking improves grades, you partition the same data differently, looking at students with high parental income (Table 3) separately from those with low parental income (Table 2). Even more surprisingly, you find that the trend is reversed: smoking lowers grades in both subpopulations. You have just encountered Simpson’s Paradox: “an event C [smoking] increases the probability of E [grade A] in a given population p, and, at the same time, decreases the probability of E in every subpopulation of p” (p. 174). One of the great achievements of Judea Pearl’s work is to dispel the cloud of mystery that has enveloped Simpson’s Paradox for a century. He analyzes it so thoroughly that he even explains (p. 182) why we find the reversal in subpopulations paradoxical. The engine driving Simpson’s Paradox is causality, and the confusion derives from trying to understand it solely through statistics.
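The reversal can be checked with a little arithmetic. The sketch below uses hypothetical subgroup counts: they are not the figures from Tables 2 and 3, which are not reproduced here, and were chosen only so that the pooled totals match the quoted 5 of 10 smokers and 2 of 5 nonsmokers receiving an A. The names `counts`, `rate`, `pooled`, and `is_simpson_reversal` are illustrative.

```python
from fractions import Fraction

# Hypothetical counts (NOT the data of Tables 2 and 3). They are chosen only
# so that the pooled totals match the quoted aggregates:
# 5 of 10 smokers and 2 of 5 nonsmokers received an A.
# Each entry is (number of A grades, number of students).
counts = {
    "high_income": {"smoker": (5, 8), "nonsmoker": (1, 1)},
    "low_income": {"smoker": (0, 2), "nonsmoker": (1, 4)},
}


def rate(a_grades, students):
    """Exact proportion of students who received an A."""
    return Fraction(a_grades, students)


def pooled(group):
    """A-rate for one group (smoker or nonsmoker) over all income strata."""
    a_grades = sum(counts[stratum][group][0] for stratum in counts)
    students = sum(counts[stratum][group][1] for stratum in counts)
    return rate(a_grades, students)


def is_simpson_reversal():
    """True if smokers do better overall but worse within every stratum."""
    pooled_favors_smokers = pooled("smoker") > pooled("nonsmoker")
    strata_favor_nonsmokers = all(
        rate(*counts[stratum]["smoker"]) < rate(*counts[stratum]["nonsmoker"])
        for stratum in counts
    )
    return pooled_favors_smokers and strata_favor_nonsmokers


if __name__ == "__main__":
    print("pooled smokers:   ", pooled("smoker"))
    print("pooled nonsmokers:", pooled("nonsmoker"))
    print("reversal present: ", is_simpson_reversal())
```

With these assumed counts, most smokers fall in the high-income stratum, where A grades are common for everyone, and most nonsmokers fall in the low-income stratum. That imbalance is what allows the pooled comparison to point the opposite way from both within-stratum comparisons.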

Cite this paper

@inproceedings{Pearl2001CausalityMR, title={Causality: Models, Reasoning and Inference}, author={Judea Pearl and Joseph O'Rourke}, year={2001} }