MRPC: An R Package for Inference of Causal Graphs.

Md Bahadur Badsha, Audrey Qiuyan Fu, Evan A Martin

doi:10.3389/fgene.2021.651812

Open Access

https://doi.org/10.3389/fgene.2021.651812

Journal: Frontiers in Genetics
Publication Date: Apr 30, 2021
Citations: 5
License type: CC BY 4.0

Affiliation: University of Idaho

Abstract

Understanding the causal relationships between variables is a central goal of many scientific inquiries. Causal relationships may be represented by directed edges in a graph (or equivalently, a network). In biology, for example, gene regulatory networks may be viewed as a type of causal network, where X→Y represents gene X regulating (i.e., being causal to) gene Y. However, existing general-purpose graph inference methods often result in a high number of false edges, whereas current causal inference methods developed for observational data in genomics can handle only limited types of causal relationships. We present MRPC (a PC algorithm with the principle of Mendelian Randomization), an R package that learns causal graphs with improved accuracy over existing methods. Our algorithm builds on the powerful PC algorithm (named after its developers Peter Spirtes and Clark Glymour), a canonical algorithm in computer science for learning directed acyclic graphs. The improvements in MRPC result in increased accuracy in identifying v-structures (i.e., X→Y←Z), and robustness to how the nodes are arranged in the input data. In the special case of genomic data that contain genotypes and phenotypes (e.g., gene expression) at the individual level, MRPC incorporates the principle of Mendelian randomization as constraints on edge direction to help orient the edges. MRPC allows for inference of causal graphs not only for general purposes, but also for biomedical data where multiple types of data may be input to provide evidence for causality. The R package is available on CRAN and is free open-source software under a GPL (≥2) license.

Highlights

Graphical models provide a powerful mathematical framework to represent dependence among variables.
Existing methods for inference of a directed acyclic graph (DAG) or its equivalence class fall into three broad classes (Scutari, 2010): (i) constraint-based methods (Tsamardinos et al., 2003; Kalisch and Bühlmann, 2007; Colombo and Maathuis, 2014), which perform statistical tests of marginal and conditional independence for pairs of nodes; (ii) score-based methods (Peters et al., 2011; Mooij et al., 2016; Nowzohour and Bühlmann, 2016), which optimize the search according to a score function; and (iii) hybrid methods (Tsamardinos et al., 2006), which combine the former two approaches.
(b) Adjusted Structural Hamming Distance: The SHD, as implemented in pcalg and bnlearn, counts how many differences exist between two directed graphs.
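
To make the idea of counting differences between two directed graphs concrete, here is a minimal Python sketch of a plain structural Hamming distance. It is an illustration only: it is neither the adjusted SHD defined by MRPC nor the exact routine in pcalg or bnlearn, and the function names and the toy graphs are assumptions made for this example. Each pair of nodes contributes one to the distance when its edge status (absent, one direction, the other direction, or undirected) differs between the two graphs.

```python
from itertools import combinations


def edge_state(edges, a, b):
    """Describe how nodes a and b are connected in a set of (tail, head) edges."""
    forward = (a, b) in edges
    backward = (b, a) in edges
    if forward and backward:
        return "undirected"  # both directions stored is treated as undirected
    if forward:
        return "a->b"
    if backward:
        return "b->a"
    return "none"


def structural_hamming_distance(nodes, edges_a, edges_b):
    """Count node pairs whose edge status differs between two graphs.

    Illustrative plain SHD (hypothetical helper), not MRPC's adjusted SHD.
    """
    edges_a, edges_b = set(edges_a), set(edges_b)
    return sum(
        edge_state(edges_a, a, b) != edge_state(edges_b, a, b)
        for a, b in combinations(nodes, 2)
    )


# Toy example: the reference graph is X -> Y -> Z.
nodes = ["X", "Y", "Z"]
truth = [("X", "Y"), ("Y", "Z")]
# The inferred graph reverses Y -> Z and adds a spurious edge X -> Z.
inferred = [("X", "Y"), ("Z", "Y"), ("X", "Z")]

print(structural_hamming_distance(nodes, truth, inferred))  # 2
```

In this sketch a reversed edge and a missing or extra edge each count as a single difference.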

Summary

INTRODUCTION

Graphical models provide a powerful mathematical framework to represent dependence among variables. The canonical causal model (see M1 in Figure 1), X→Y→Z, where X is the instrumental variable, Y the exposure, and Z the outcome, underlies most of the existing causal inference methods for genomic data based on the principle of Mendelian randomization (PMR) (e.g., Didelez and Sheehan, 2007; Lawlor et al., 2008; Millstein et al., 2009; Smith and Hemani, 2014; Millstein et al., 2016; Wang and Michoel, 2017; Yang et al., 2017; Hemani et al., 2018; Verbanck et al., 2018; Howey et al., 2020; Zhao et al., 2020). Whereas these methods use the genetic variant as the instrumental variable to account for unobserved confounding, we assume causal sufficiency, i.e., confounding variables are fully observed and may be incorporated into the network inference (Spirtes et al., 2000). Our package further provides alternative approaches to graph visualization and graph comparison that are unavailable in the bnlearn and pcalg packages.
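
The role of the genetic variant as a constraint on edge direction can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the MRPC implementation or its API: the function name, the node labels (V1 for a genetic variant, T1 and T2 for phenotypes), and the input format are all hypothetical. The only rule encoded is the one implied by the PMR, namely that an edge between a genotype node and a phenotype node points from the genotype to the phenotype; every other edge is left unoriented for later steps.

```python
def orient_with_mr_constraint(skeleton_edges, genotype_nodes):
    """Orient skeleton edges that join a genotype node and a phenotype node.

    Hypothetical helper for illustration. Returns (directed, undirected):
    directed holds (tail, head) pairs pointing away from the genotype node,
    undirected holds the edges this rule cannot orient.
    """
    directed, undirected = [], []
    for a, b in skeleton_edges:
        a_is_genotype = a in genotype_nodes
        b_is_genotype = b in genotype_nodes
        if a_is_genotype and not b_is_genotype:
            directed.append((a, b))  # genotype -> phenotype
        elif b_is_genotype and not a_is_genotype:
            directed.append((b, a))  # flip so the genotype is the tail
        else:
            undirected.append((a, b))  # rule does not apply to this edge
    return directed, undirected


# Toy skeleton: variant V1 is adjacent to T1, and T1 is adjacent to T2.
skeleton = [("T1", "V1"), ("T1", "T2")]
directed, undirected = orient_with_mr_constraint(skeleton, {"V1"})

print(directed)    # [('V1', 'T1')]
print(undirected)  # [('T1', 'T2')]
```

Here the edge between T1 and T2 stays undirected; in a full algorithm its direction would have to come from additional evidence such as conditional independence tests.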

METHOD

RESULTS

DISCUSSION

DATA AVAILABILITY STATEMENT
