Do-calculus enables estimation of causal effects in partially observed biomolecular pathways.

Sara Mohammad-Taheri, Karen Sachs, Charles Tapley Hoyt, Vartika Tewari, Robert Ness, Jeremy Zucker, Olga Vitek

doi:10.1093/bioinformatics/btac251

Abstract

Motivation: Estimating causal queries, such as changes in protein abundance in response to a perturbation, is a fundamental task in the analysis of biomolecular pathways. The estimation requires experimental measurements on the pathway components. However, in practice many pathway components are left unobserved (latent) because they are either unknown or difficult to measure. Latent variable models (LVMs) are well-suited for such estimation. Unfortunately, LVM-based estimation of causal queries can be inaccurate when parameters of the latent variables are not uniquely identified, or when the number of latent variables is misspecified. This has limited the use of LVMs for causal inference in biomolecular pathways.

Results: In this article, we propose a general and practical approach for LVM-based estimation of causal queries. We prove that, despite the challenges above, LVM-based estimators of causal queries are accurate if the queries are identifiable according to Pearl’s do-calculus, and we describe an algorithm for their estimation. We illustrate the breadth and the practical utility of this approach for estimating causal queries in four synthetic and two experimental case studies, where structures of biomolecular pathways challenge the existing methods for causal query estimation.

Availability and implementation: The code and the data documenting all the case studies are available at https://github.com/srtaheri/LVMwithDoCalculus.

Supplementary information: Supplementary data are available at Bioinformatics online.

Full Text