Abstract

Background

Gene expression profiling and other genome-scale measurement technologies provide comprehensive information about molecular changes resulting from a chemical or genetic perturbation, or disease state. A critical challenge is the development of methods to interpret these large-scale data sets to identify specific biological mechanisms that can provide experimentally verifiable hypotheses and lead to the understanding of disease and drug action.

Results

We present a detailed description of Reverse Causal Reasoning (RCR), a reverse engineering methodology to infer mechanistic hypotheses from molecular profiling data. This methodology requires prior knowledge in the form of small networks that causally link a key upstream controller node representing a biological mechanism to downstream measurable quantities. These small directed networks are generated from a knowledge base of literature-curated qualitative biological cause-and-effect relationships expressed as a network. The small mechanism networks are evaluated as hypotheses to explain observed differential measurements. We provide a simple implementation of this methodology, Whistle, specifically geared towards the analysis of gene expression data and using prior knowledge expressed in Biological Expression Language (BEL). We present the Whistle analyses for three transcriptomic data sets using a publicly available knowledge base. The mechanisms inferred by Whistle are consistent with the expected biology for each data set.

Conclusions

Reverse Causal Reasoning yields mechanistic insights into the interpretation of gene expression profiling data that are distinct from and complementary to the results of analyses using ontology or pathway gene sets. This reverse engineering algorithm provides an evidence-driven approach to the development of models of disease, drug action, and drug toxicity.

Highlights

  • Gene expression profiling and other genome-scale measurement technologies provide comprehensive information about molecular changes resulting from a chemical or genetic perturbation, or disease state

  • Example data sets: To highlight the utility of Reverse Causal Reasoning (RCR) for generating testable mechanistic hypotheses from gene expression profiling data, we provide examples of the application of Whistle v1.0 to three published gene expression data sets using a network compiled from the Biological Expression Language (BEL) Large Corpus as the prior knowledge source

  • The use of a knowledge base structured as a directed, causal network provides RCR with some key advantages over other analysis techniques relying on prior knowledge: (1) RCR does not rely on the assumption that changes in RNA expression are equivalent to changes in the activity of the corresponding protein, (2) the structuring of gene sets as networks (‘HYPs’) allows evaluation of genes up- and down-regulated by the same mechanism as a cohesive, causally consistent mechanism, and (3) flexibility to generate HYP networks for evaluation from the knowledge base network, potentially combining related upstream nodes to a single HYP or dividing HYPs based on knowledge context
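
To make the idea of a causally consistent HYP concrete, the following is a minimal sketch, not the Whistle implementation. It assumes a HYP can be represented as a mapping from downstream genes to the direction (+1 or -1) each is predicted to change when the upstream node increases, and it scores that HYP by counting sign agreements with observed significant changes, followed by a one-sided binomial tail as an illustrative concordance measure. The gene names, the function `score_hyp`, and the choice of statistic are all hypothetical.

```python
from math import comb


def score_hyp(hyp, observed):
    """Score the hypothesis that the upstream node of a HYP is increased.

    hyp:      dict gene -> +1 / -1, predicted direction of change
              (assumed representation, for illustration only).
    observed: dict gene -> +1 / -1 for significantly changed genes;
              genes that are absent or 0 count as unchanged.
    """
    correct = 0  # observed change matches the predicted direction
    contra = 0   # observed change opposes the predicted direction
    for gene, predicted in hyp.items():
        change = observed.get(gene, 0)
        if change == 0:
            continue  # unchanged or unmeasured genes carry no sign evidence
        if change == predicted:
            correct += 1
        else:
            contra += 1

    n = correct + contra
    # One-sided binomial tail P(X >= correct) with p = 0.5: the chance of
    # seeing at least this many agreements if signs were assigned at random.
    if n == 0:
        p_value = 1.0
    else:
        p_value = sum(comb(n, i) for i in range(correct, n + 1)) / 2 ** n
    return {"correct": correct, "contra": contra, "concordance_p": p_value}


# Hypothetical HYP: two genes predicted up, two predicted down.
example_hyp = {"GENE_A": 1, "GENE_B": 1, "GENE_C": -1, "GENE_D": -1}
# Hypothetical observations; GENE_E is not part of the HYP and is ignored.
example_observed = {"GENE_A": 1, "GENE_B": 1, "GENE_C": -1, "GENE_D": 1, "GENE_E": 1}

result = score_hyp(example_hyp, example_observed)
print(result)
```

Because up- and down-regulated genes are scored against a single signed prediction, a gene that moves in the wrong direction counts as evidence against the mechanism, which an unsigned gene-set overlap would not capture. Testing the opposite hypothesis (upstream node decreased) would amount to flipping every predicted sign.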

Summary

Introduction

Gene expression profiling and other genome-scale measurement technologies provide comprehensive information about molecular changes resulting from a chemical or genetic perturbation, or disease state. A critical challenge is the development of methods to interpret these large-scale data sets to identify specific biological mechanisms that can provide experimentally verifiable hypotheses and lead to the understanding of disease and drug action. Molecular profiling technologies have enabled the collection of large, exploratory data sets consisting of measurements for tens of thousands of molecular entities. These rich data sets hold promise for understanding the molecular bases of disease, drug action, and drug toxicity, but often do not lead to a reasonably short list of potential molecular mechanisms that can be investigated further by targeted experiments. Genes can be grouped into sets based on a variety of criteria, including (1) functional annotation, (2) pathway maps, (3) regulatory or structural motifs, and (4) common response to an experimental perturbation.

