Abstract

We present a computational method to infer causal mechanisms in cell biology by analyzing changes in high-throughput proteomic profiles on the background of prior knowledge captured in biochemical reaction knowledge bases. The method mimics a biologist's traditional approach of explaining changes in data using prior knowledge but does this at the scale of hundreds of thousands of reactions. This is a specific example of how to automate scientific reasoning processes and illustrates the power of mapping from experimental data to prior knowledge via logic programming. The identified mechanisms can explain how experimental and physiological perturbations, propagating in a network of reactions, affect cellular responses and their phenotypic consequences. Causal pathway analysis is a powerful and flexible discovery tool for a wide range of cellular profiling data types and biological questions. The automated causation inference tool, as well as the source code, are freely available at http://causalpath.org.

Highlights

  • Central to a cell’s decision-making processes is a vast network of biochemical reactions

  • Considering that the vast majority of currently available proteomic experiments have either few perturbations or only uncontrolled variation, it is very important that we use prior knowledge to its full potential. In this perturbation-poor setting, model-building activity is transformed into selecting parts of the prior knowledge that can best explain the shape of the data, which we call “pathway extraction.” Here, we present a pathway extraction method, CausalPath, which uses the rich semantics of curated pathway knowledge, including the type of mechanism, the direction, signs of effect, and post-translational modifications

  • We demonstrate the value of CausalPath on multiple publicly available datasets covering a wide range of scenarios and biological questions: in a set of time-resolved epidermal growth factor (EGF) stimulation experiments we detected EGFR activation with its signaling downstream of MAPKs, including feedback inhibition on EGFR; from ligand-induced and drug-inhibited cell-line experiments, we estimated the precision of CausalPath predictions; from CPTAC (Clinical Proteomic Tumor Analysis Consortium) protein mass spectrometry datasets for ovarian and breast cancer we elucidated general and subtype-specific signaling, as well as regulators of well-known cancer proteins; and in RPPA (Reverse Phase Protein Array) experimental datasets of 32 TCGA (The Cancer Genome Atlas) cancer studies we found a core signaling network that is recurrently identified across many cancer types
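
The idea of pathway extraction described in the highlights above can be illustrated with a toy sketch. It is not the CausalPath implementation: it assumes a drastically simplified prior-knowledge format (signed, directed relations between named entities) and ignores mechanism types and post-translational modification sites. The names `Relation`, `extract_explanatory_relations`, and the placeholder entities `A` to `D` are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    """A hypothetical signed, directed prior-knowledge relation."""
    source: str
    target: str
    sign: int  # +1: source up implies target up; -1: source up implies target down


def extract_explanatory_relations(relations, changes):
    """Keep the relations whose sign is consistent with the observed changes.

    `changes` maps an entity name to +1 (increased) or -1 (decreased).
    Entities absent from `changes` are treated as unmeasured or unchanged,
    so relations touching them cannot explain anything and are skipped.
    """
    explained = []
    for rel in relations:
        source_change = changes.get(rel.source)
        target_change = changes.get(rel.target)
        if source_change is None or target_change is None:
            continue
        # The relation explains the data if propagating the source change
        # through the relation's sign yields the observed target change.
        if source_change * rel.sign == target_change:
            explained.append(rel)
    return explained


# Toy prior knowledge and toy measurements (placeholders, not real biology).
prior_knowledge = [
    Relation("A", "B", +1),
    Relation("B", "C", -1),
    Relation("A", "C", +1),
    Relation("C", "D", +1),
]
observed_changes = {"A": +1, "B": +1, "C": -1}

extracted = extract_explanatory_relations(prior_knowledge, observed_changes)
for rel in extracted:
    print(f"{rel.source} -> {rel.target} (sign {rel.sign:+d})")
```

In this toy case the relations A to B and B to C are retained because their signs agree with the observed changes, A to C is dropped because it predicts the wrong direction for C, and C to D is dropped because D was not measured.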


Introduction

Central to a cell’s decision-making processes is a vast network of biochemical reactions. Traditionally, interconnected pathway models have been built through the curation of reactions based on carefully designed low-throughput controlled experiments. This classic approach led to the first large-scale metabolic maps and later was extended to signaling and transcriptional processes. Today this knowledge is represented in hundreds of pathway and interaction databases. The newer, data-driven inference approach leverages recent developments in proteomics and other molecular technologies to directly infer graphical models, ab initio, from high-throughput measurements of controlled perturbations and natural variation.[1,2,3]
