Abstract

Forthcoming, Statistics in Medicine, Volume 32, Number 25, November 2013.
TECHNICAL REPORT R-412, June 2013

Comment on “Causal inference, probability theory, and graphical insights” by Stuart G. Baker

Judea Pearl
University of California, Los Angeles
Computer Science Department
Los Angeles, CA, 90095-1596, USA
judea@cs.ucla.edu

Modern causal inference owes much of its progress to a strict and crisp distinction between probabilistic and causal information. This distinction recognizes that probability theory is insufficient for posing causal questions, let alone answering them, and dictates that every exercise in causal inference must commence with some extra knowledge that cannot be expressed in probability alone.¹ The paper by Baker attempts to overturn this distinction and argues that “probability theory is a desirable and sufficient basis for many topics in causal inference.” My comments will disprove Baker’s claim, in the hope of convincing readers of the importance of keeping the boundaries between probabilistic and causal concepts crisp and visible.

Baker’s argument begins with: “...besides explaining such causal graph topics as M-bias (adjusting for a collider) and bias amplification and attenuation (when adjusting for instrumental variable), probability theory is also the foundation of the paired availability design for historical control” (abstract). While I am not versed in the intricacies of “paired availability design” (Google Scholar lists only a handful of entries in this category), I doubt it can be based solely on probabilities. Indeed, Baker himself resorts to counterfactuals and other non-probabilistic notions² in explaining the research questions a “paired availability design” attempts to answer. I am quite familiar, however, with the concepts of “M-bias,” “bias,” “Simpson’s paradox,” and “instrumental variable,” which I will show to have no interpretation in probability theory alone.
I will start with the concept of “instrumental variable,” which should be familiar to most readers and which is often mistaken to have a probabilistic definition (see [2, pp.

¹ Cartwright [1] summarized this limitation in a well-known slogan: “no causes in, no causes out.”

² By “non-probabilistic notions” I mean relations or parameters that cannot be defined in terms of joint distributions of observed variables. The restriction to observed variables is important, for otherwise everything would become probabilistic, including Cinderella’s age, horoscopic predictions, counterfactuals, latent variables, the answers to our research questions, and so on; we need merely hypothesize a distribution over such variables and turn every problem probabilistic. The distinction between causal and probabilistic information would then lose its meaning and usefulness.
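The notion of a quantity that "cannot be defined in terms of joint distributions of observed variables" can be illustrated with a small sketch. The example below is not taken from the comment itself; the two toy models, their parameters, and all function names are assumptions chosen for illustration. It builds two structural models over binary X and Y that induce exactly the same joint distribution yet disagree on the effect of setting X by intervention, so the joint distribution alone cannot determine that effect.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)   # assumed prior of the exogenous root variable
FLIP = Fraction(1, 5)   # assumed probability that the noise term flips a value


def model_x_causes_y(do_x=None):
    """Hypothetical model A: X -> Y, with Y = X xor noise.

    Returns the exact distribution {(x, y): probability}.
    If do_x is given, X is set by intervention instead of by its own mechanism.
    """
    dist = {}
    for u, n in product((0, 1), repeat=2):
        p = HALF * (FLIP if n else 1 - FLIP)
        x = u if do_x is None else do_x
        y = x ^ n
        dist[(x, y)] = dist.get((x, y), Fraction(0)) + p
    return dist


def model_y_causes_x(do_x=None):
    """Hypothetical model B: Y -> X, with X = Y xor noise.

    Intervening on X overrides its mechanism and leaves Y untouched.
    """
    dist = {}
    for u, n in product((0, 1), repeat=2):
        p = HALF * (FLIP if n else 1 - FLIP)
        y = u
        x = (y ^ n) if do_x is None else do_x
        dist[(x, y)] = dist.get((x, y), Fraction(0)) + p
    return dist


def prob_y1(dist):
    """Marginal probability that Y = 1 under the given distribution."""
    return sum(p for (_, y), p in dist.items() if y == 1)


if __name__ == "__main__":
    # Observationally the two models are indistinguishable.
    print("same joint:", model_x_causes_y() == model_y_causes_x())
    # Under the intervention do(X = 1) they give different answers.
    print("model A, P(Y=1 | do(X=1)):", prob_y1(model_x_causes_y(do_x=1)))
    print("model B, P(Y=1 | do(X=1)):", prob_y1(model_y_causes_x(do_x=1)))
```

Exact fractions are used rather than simulation so that the equality of the two joint distributions is an identity, not an approximation. Choosing between the two models requires the extra, non-probabilistic knowledge of which variable is the cause.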

Highlights

  • Modern causal inference owes much of its progress to a strict and crisp distinction between probabilistic and causal information.

  • This distinction recognizes that probability theory is insufficient for posing causal questions, let alone answering them, and dictates that every exercise in causal inference must commence with some extra knowledge that cannot be expressed in probability alone.[1]

  • I am quite familiar with the concepts of “M-bias,” “bias,” “Simpson’s paradox,” and “instrumental variable,” which I will show to have no interpretation in probability theory alone.


Summary

Introduction

Modern causal inference owes much of its progress to a strict and crisp distinction between probabilistic and causal information. Baker’s argument begins with: “...besides explaining such causal graph topics as M-bias (adjusting for a collider) and bias amplification and attenuation (when adjusting for instrumental variable), probability theory is also the foundation of the paired availability design for historical control” (abstract).
