Abstract

Background: Current technologies have led to the availability of multiple genomic data types in sufficient quantity and quality to serve as a basis for automatic global network inference. Accordingly, there is currently a large variety of network inference methods that learn regulatory networks to varying degrees of detail. These methods have different strengths and weaknesses and thus can be complementary. However, combining different methods in a mutually reinforcing manner remains a challenge.

Methodology: We investigate how three scalable methods can be combined into a useful network inference pipeline. The first is a novel t-test-based method that relies on a comprehensive steady-state knock-out dataset to rank regulatory interactions. The remaining two are previously published methods, one based on mutual information (tlCLR) and one on ordinary differential equations (Inferelator 1.0), that use both time-series and steady-state data to rank regulatory interactions. Inferelator 1.0 has the added advantage of also inferring dynamic models of gene regulation, which can be used to predict the system's response to new perturbations.

Conclusion/Significance: Our t-test-based method proved powerful at ranking regulatory interactions, tying for first among the methods in the DREAM4 100-gene in-silico network inference challenge. We demonstrate complementarity between this method and the two methods that take advantage of time-series data by combining the three into a pipeline whose ability to rank regulatory interactions is markedly improved compared to any single method alone. Moreover, the pipeline is able to accurately predict the response of the system to new conditions (in this case, new double knock-out genetic perturbations). Our evaluation of the performance of multiple methods for network inference suggests avenues for future methods development and provides simple considerations for genomic experimental design.
Our code is publicly available at http://err.bio.nyu.edu/inferelator/.

Highlights

  • Predicting how a cell will respond, at the molecular level, to environmental and genetic perturbations is a key problem in systems biology

  • The main challenge in the DREAM4 100-gene in-silico regulatory network competition was to predict the topology of five networks

  • We evaluated the performance of four pipelines for learning regulatory networks, namely: median-corrected Z-score (MCZ), time-lagged Context Likelihood of Relatedness (tlCLR)-Inferelator, tlCLR-Inferelator+MCZ, and Resampling+MCZ
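
To make the knock-out-based ranking idea concrete, here is a minimal sketch of one plausible reading of a median-corrected Z-score. It is not the authors' code, and the exact formula is an assumption: the score for a candidate edge from regulator j to target i is taken to be the deviation of gene i's expression in the knock-out of gene j from gene i's median across all knock-outs, scaled by gene i's standard deviation across knock-outs. The function names `median_corrected_zscores` and `rank_edges` and the toy matrix are hypothetical.

```python
import numpy as np

def median_corrected_zscores(ko):
    """Score candidate edges from a steady-state knock-out matrix.

    ko[j, i] is the expression of gene i when gene j is knocked out.
    Returns z[j, i], a score for the candidate edge j -> i.
    ASSUMPTION: this formula is an illustrative reading, not the published one.
    """
    ko = np.asarray(ko, dtype=float)
    center = np.median(ko, axis=0)        # per-gene baseline across all knock-outs
    spread = np.std(ko, axis=0, ddof=1)   # per-gene variability across knock-outs
    spread[spread == 0] = np.inf          # genes that never change get score 0
    z = np.abs(ko - center) / spread
    np.fill_diagonal(z, 0.0)              # ignore the knocked-out gene itself
    return z

def rank_edges(z):
    """Return (regulator, target, score) triples, highest score first."""
    n = z.shape[0]
    edges = [(j, i, float(z[j, i])) for j in range(n) for i in range(n) if i != j]
    return sorted(edges, key=lambda e: e[2], reverse=True)

# Toy data: six genes at baseline 1.0, each knocked-out gene drops to 0.0,
# and knocking out gene 0 also lowers gene 2 (a planted edge 0 -> 2).
toy = np.ones((6, 6))
np.fill_diagonal(toy, 0.0)
toy[0, 2] = 0.2

for regulator, target, score in rank_edges(median_corrected_zscores(toy))[:3]:
    print(regulator, "->", target, round(score, 3))
```

On this toy matrix the planted edge 0 -> 2 ranks first. Using the median rather than the mean as the baseline keeps the knock-out of the gene itself, and any other strong outlier, from shifting the reference level.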

Introduction

Predicting how a cell will respond, at the molecular level, to environmental and genetic perturbations is a key problem in systems biology. Methods that learn less detailed regulatory models scale to larger systems and data sizes than methods that learn more complex models. Another critical difference between methods is whether they learn causal (directed) edges or undirected relationships. A computational biologist should choose the most detailed method that the data will support, because more detailed models can suggest more focused biological hypotheses and can model a system’s behavior in ways that simple network models cannot. Given this constant need to balance the specific features of any given biological dataset against the capabilities of multiple regulatory network (RN) inference algorithms, testing RN inference methods on a variety of datasets is a critical field-wide activity. There is currently a large variety of network inference methods that learn regulatory networks to varying degrees of detail. These methods have different strengths and weaknesses and can be complementary. Combining different methods in a mutually reinforcing manner remains a challenge.
