Abstract

Background: Complex gene regulatory networks underlie many cellular and developmental processes. While a variety of experimental approaches can be used to discover how genes interact, few biological systems have been systematically evaluated to the extent required for an experimental definition of the underlying network. Therefore, the development of computational methods that can use limited experimental data to define and model a gene regulatory network would provide a useful tool to evaluate many important but incompletely understood biological processes. Such methods can assist in extracting all relevant information from data that are available, identify unexpected regulatory relationships and prioritize future experiments.

Results: To facilitate the analysis of gene regulatory networks, we have developed a computational modeling pipeline method that complements traditional evaluation of experimental data. For a proof-of-concept example, we have focused on the gene regulatory network in the nematode C. elegans that mediates the developmental choice between mesodermal (muscle) and ectodermal (skin) cell fates in the embryonic C lineage. We have used gene expression data to build two models: a knowledge-driven model based on gene expression changes following gene perturbation experiments, and a data-driven mathematical model derived from time-course gene expression data recovered from wild-type animals. We show that both models can identify a rich set of network gene interactions. Importantly, the mathematical model built only from wild-type data can predict interactions demonstrated by the perturbation experiments better than chance, and better than an existing knowledge-driven model built from the same data set. The mathematical model also provides new biological insight, including a dissection of zygotic from maternal functions of a key transcriptional regulator, PAL-1, and identification of non-redundant activities of the T-box genes tbx-8 and tbx-9.

Conclusions: This work provides a strong example for a mathematical modeling approach that solely uses wild-type data to predict an underlying gene regulatory network. The modeling approach complements traditional methods of data analysis, suggesting non-intuitive network relationships and guiding future experiments.

Highlights

  • Complex gene regulatory networks underlie many cellular and developmental processes

  • The source data set for the Mathematically Inferred Model (MIM) is provided in [13]

  • The primary source of data for our gold standard network, which we label Gold Standard Network (GSN), regulating cell fate decisions in the C. elegans C lineage is [14], with additional data curated from the literature (Additional file 1: Table S1, tabs “Gene interactions” and “Gene interactions – refs”)


Summary

Introduction

Complex gene regulatory networks underlie many cellular and developmental processes. While a variety of experimental approaches can be used to discover how genes interact, few biological systems have been systematically evaluated to the extent required for an experimental definition of the underlying network. The development of computational methods that can use limited experimental data to define and model a gene regulatory network would provide a useful tool to evaluate many important but incompletely understood biological processes. Such methods can assist in extracting all relevant information from data that are available, identify unexpected regulatory relationships and prioritize future experiments. While a variety of genetic and genomic methods can be used to dissect the regulatory inputs into developmental cell fate decisions, large-scale experimental analyses are limited by time and expense. These practical constraints argue for the development of computational methods that maximize extraction of biologically relevant information from the available data, as well as the development of predictive models to prioritize experiments for future testing. Data-driven models can always be built given an appropriate data set; the problem is that there are typically hundreds to thousands of possible models for a given data set, so model selection techniques must be employed.
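To make the model selection problem concrete, the following is a minimal sketch and not the method used in this work, which this excerpt does not describe. It assumes a linear model for a single target gene and uses the Bayesian information criterion (BIC) as the selection rule. The gene names (`geneA` to `geneD`, `target`) and the synthetic expression values are hypothetical and serve only as an illustration.

```python
import itertools
import math
import random


def ols_rss(columns, y):
    """Fit y ~ intercept + columns by least squares; return the residual sum of squares."""
    n = len(y)
    cols = [[1.0] * n] + columns
    k = len(cols)
    # Augmented normal equations [X^T X | X^T y].
    m = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols]
         + [sum(a * b for a, b in zip(ci, y))] for ci in cols]
    # Gaussian elimination with partial pivoting.
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(i + 1, k):
            factor = m[r][i] / m[i][i]
            for j in range(i, k + 1):
                m[r][j] -= factor * m[i][j]
    # Back substitution.
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(m[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (m[i][k] - tail) / m[i][i]
    fitted = [sum(b * col[t] for b, col in zip(beta, cols)) for t in range(n)]
    return sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))


def bic(rss, n, n_params):
    """BIC for a Gaussian linear model (lower is better); an assumed selection rule."""
    return n * math.log(max(rss, 1e-12) / n) + n_params * math.log(n)


def select_regulators(expression, target):
    """Score every subset of candidate regulators for one target gene."""
    y = expression[target]
    candidates = sorted(g for g in expression if g != target)
    scores = {}
    for size in range(len(candidates) + 1):
        for subset in itertools.combinations(candidates, size):
            rss = ols_rss([expression[g] for g in subset], y)
            scores[subset] = bic(rss, len(y), len(subset) + 1)
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical synthetic data: the target depends on geneA and geneC only.
random.seed(0)
n_samples = 40
expression = {g: [random.gauss(0, 1) for _ in range(n_samples)]
              for g in ("geneA", "geneB", "geneC", "geneD")}
expression["target"] = [2.0 * a - 1.5 * c + random.gauss(0, 0.1)
                        for a, c in zip(expression["geneA"], expression["geneC"])]

best_set, scores = select_regulators(expression, "target")
print(len(scores), "candidate models; lowest BIC:", best_set)
```

Even in this toy setting, four candidate regulators for one target give 16 candidate models. The count doubles with each additional candidate regulator and multiplies across targets, which is why exhaustive enumeration quickly becomes impractical and a principled selection criterion is needed.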
