Misleading Epidemiological and Statistical Evidence in the Presence of Simpson's Paradox: An Illustrative Study Using Simulated Scenarios of Observational Study Designs.

Chanapong Rojanaworarit

doi:10.25122/jml-2019-0120

Abstract

This study empirically illustrates the mechanism by which epidemiological effect measures and statistical evidence can be misleading in the presence of Simpson's paradox, and identifies alternative methods of analysis to manage the paradox.

Three scenarios of observational study designs (cross-sectional, cohort, and case-control) are simulated. In each scenario, data are generated and analyzed with various epidemiological and statistical methods to obtain empirical results that exhibit Simpson's paradox and lead to misleading conclusions. Rational methods of analysis are also applied to illustrate how to avoid these pitfalls and obtain valid results.

In the presence of Simpson's paradox, results from analyses of the overall data contradict the findings from all subgroups of the same data. The paradox occurs when the distributions of confounding characteristics are unequal in the groups being compared. Data analysis methods that do not take the confounding factor into account, including epidemiological 2×2 table analysis, the independent-samples t-test, the Wilcoxon rank-sum test, the chi-square test, and univariable regression analysis, cannot manage Simpson's paradox and lead to misleading research conclusions. The Mantel-Haenszel procedure and multivariable regression methods are examples of rational analysis methods that lead to valid results.

Simpson's paradox therefore arises as a consequence of extremely unequal distributions of a specific inherent characteristic in the groups being compared. Analytical methods that control for the confounding effect must be applied to manage the paradox and obtain valid research evidence regarding the causal association.
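The reversal described in the abstract can be reproduced with a small numerical sketch. The counts below are hypothetical and chosen only for illustration; they are not the simulated data from the study. The sketch assumes a cohort-style layout with two strata of a confounder, and the function names are likewise illustrative. It compares the crude risk ratio from the pooled 2×2 table with the stratum-specific risk ratios and the Mantel-Haenszel pooled risk ratio.

```python
# Hypothetical cohort counts (NOT from the study), one tuple per confounder stratum:
# (exposed_cases, exposed_total, unexposed_cases, unexposed_total)
strata = {
    "low-risk stratum": (90, 900, 5, 100),
    "high-risk stratum": (40, 100, 180, 900),
}


def risk_ratio(a, n1, c, n0):
    """Risk in the exposed (a / n1) divided by risk in the unexposed (c / n0)."""
    return (a / n1) / (c / n0)


def crude_risk_ratio(strata):
    """Risk ratio from the pooled 2x2 table, ignoring the confounder."""
    a = sum(s[0] for s in strata.values())
    n1 = sum(s[1] for s in strata.values())
    c = sum(s[2] for s in strata.values())
    n0 = sum(s[3] for s in strata.values())
    return risk_ratio(a, n1, c, n0)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel pooled risk ratio across strata."""
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata.values():
        n = n1 + n0  # stratum size
        numerator += a * n0 / n
        denominator += c * n1 / n
    return numerator / denominator


stratum_rrs = {name: risk_ratio(*counts) for name, counts in strata.items()}
crude = crude_risk_ratio(strata)
mh = mantel_haenszel_risk_ratio(strata)

for name, rr in stratum_rrs.items():
    print(f"{name}: RR = {rr:.2f}")
print(f"crude (pooled) RR = {crude:.2f}")
print(f"Mantel-Haenszel RR = {mh:.2f}")
```

With these assumed counts, the exposure appears harmful within each stratum but protective in the pooled table, because the exposed group is concentrated in the low-risk stratum. The Mantel-Haenszel estimate agrees with the stratum-specific direction.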

Full Text