Abstract

Simpson’s paradox, also called the reversal paradox or the amalgamation paradox, is a statistical phenomenon in which data aggregated at the group level (or across a set of groups) support a conclusion that is absent from, or opposite to, the conclusion suggested by the same data before aggregation, at the individual level (or at the level of the separate groups). The paradox is resolved when the statistical model stratifies the data by group. An intuitive example is the correlation between typing speed and typos. At the group level the correlation is negative: experienced typists type faster and make fewer typos. At the individual level, however, the correlation is positive: the faster an individual types, the more typos that individual makes. It would therefore be fallacious to conclude that the relationship between typing speed and typos observed at the group level also holds at the individual level. Simpson’s paradox is especially problematic in the physical and social sciences, where statistical trends in point data observed at the group level are often fallaciously used to draw inferences about individuals or, less often, the other way round. Hence, equivalence between the group and individual levels must be explicitly tested.
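
The typing example can be made concrete with a small numerical sketch. The code below is illustrative only: the three typists, their baseline speeds and typo counts, and the within-typist slope are invented values, not data from the paper. It computes the Pearson correlation between speed and typos once for the pooled sessions and once per typist, which is one simple way of stratifying by group.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical typists: (baseline speed in words per minute, baseline typos per page).
# More experienced typists are faster AND make fewer typos on average.
typists = {
    "novice": (30, 12),
    "intermediate": (50, 8),
    "expert": (70, 4),
}

# Assumed session-to-session deviations from each typist's baseline speed.
speed_offsets = [-4, -2, 0, 2, 4]
# Assumed within-typist effect: each extra word per minute adds 0.5 typos.
TYPOS_PER_EXTRA_WPM = 0.5

# One list of (speed, typos) sessions per typist.
sessions = {
    name: [(speed + d, typos + TYPOS_PER_EXTRA_WPM * d) for d in speed_offsets]
    for name, (speed, typos) in typists.items()
}

# Individual level: correlation computed separately for each typist.
within_r = {
    name: pearson([s for s, _ in rows], [t for _, t in rows])
    for name, rows in sessions.items()
}

# Aggregated level: all sessions pooled, ignoring who typed them.
pooled = [row for rows in sessions.values() for row in rows]
pooled_r = pearson([s for s, _ in pooled], [t for _, t in pooled])

for name, r in within_r.items():
    print(f"{name:>12}: r = {r:+.2f}")
print(f"{'pooled':>12}: r = {pooled_r:+.2f}")
```

With these made-up numbers the correlation is positive for every typist and negative for the pooled data, so reading the pooled sign as a statement about any one typist would be exactly the fallacy described above.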

