Abstract

Background: The comprehension of the gene regulatory code in eukaryotes is one of the major challenges of systems biology, and is a requirement for the development of novel therapeutic strategies for multifactorial diseases. Its bi-fold degeneration precludes brute force and statistical approaches based on the genomic sequence alone. Rather, recursive integration of systematic, whole-genome experimental data with advanced statistical regulatory sequence predictions needs to be developed. Such experimental approaches, as well as the prediction tools, are only starting to become available, and the growing numbers of genome sequences and empirical sequence annotations are under continual discovery-driven change. Furthermore, given the complexity of the question, a multi-laboratory effort lasting a decade or more needs to be envisioned. These constraints need to be considered in the creation of a framework that can pave a road to successful comprehension of the gene regulatory code.

Results: We introduce here a concept for such a framework, based entirely on systematic annotation of the genomic sequence in terms of probability profiles, using any type of relevant experimental and theoretical information, followed by cross-correlation analysis in hypothesis-driven model building and testing.

Conclusion: Probability landscapes, which include the probabilistic representation of the genomic sequence as a reference set, can be used efficiently to discover and analyze correlations amongst initially heterogeneous and unrelatable descriptions and genome-wide measurements. Furthermore, this structure can serve as a support for automatically generating and testing hypotheses for alternative gene regulatory grammars, and for evaluating them through statistical analysis of the high-dimensional correlations between genomic sequence, sequence annotations, and experimental data. Finally, this structure provides a concrete and tangible basis for attempting to formulate a mathematical description of gene regulation in eukaryotes on a genome-wide scale.
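The idea of annotating a sequence with probability profiles and then cross-correlating them can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`sequence_to_profile`, `track`, `cross_correlation`), the uniform treatment of ambiguous bases, and the use of Pearson correlation at each lag are all assumptions made for illustration only.

```python
from math import sqrt

ALPHABET = "ACGT"


def sequence_to_profile(seq):
    """Turn a letter-coded sequence into one probability vector per position.

    Assumption: an unambiguous base gets probability 1; any other symbol
    (e.g. 'N' or a gap) is spread uniformly over the four bases.
    """
    profile = []
    for base in seq.upper():
        if base in ALPHABET:
            profile.append([1.0 if base == b else 0.0 for b in ALPHABET])
        else:
            profile.append([0.25] * len(ALPHABET))
    return profile


def track(profile, bases):
    """Per-position probability that the base is one of `bases`."""
    idx = [ALPHABET.index(b) for b in bases]
    return [sum(row[i] for i in idx) for row in profile]


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def cross_correlation(x, y, max_lag):
    """Correlate x[i] with y[i + lag] for each lag in [-max_lag, max_lag].

    Both tracks must have the same length.
    """
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:len(y) + lag]
        result[lag] = pearson(a, b)
    return result


# Hypothetical toy data: a short sequence and a made-up annotation track
# that is simply the GC track shifted by one position.
profile = sequence_to_profile("ATGCGCNNAT")
gc = track(profile, "GC")
shifted = [0.0] + gc[:-1]
lags = cross_correlation(gc, shifted, max_lag=2)
```

In this toy case the correlation peaks at a lag of one position, which recovers the shift built into the made-up annotation track. Real annotation tracks would come from experimental or predicted data rather than from the sequence itself.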

Highlights

  • The comprehension of the gene regulatory code in eukaryotes is one of the major challenges of systems biology, and is a requirement for the development of novel therapeutic strategies for multifactorial diseases

  • The currently used format for representing genomic sequences is a letter code that mostly does not indicate the location of gaps

  • We show that the genomic sequence itself, its annotation with empirically derived features, ... (http://www.tbiomed.com/content/5/1/9)


Introduction

The comprehension of the gene regulatory code in eukaryotes is one of the major challenges of systems biology, and is a requirement for the development of novel therapeutic strategies for multifactorial diseases. Recursive integration of systematic, whole-genome experimental data with advanced statistical regulatory sequence predictions needs to be developed. Cell- and cell-state-specific transcriptome profiles result from a combination of tightly controlled regulatory events in response to intra-, extra-, and inter-cellular signals [2]. These transcription [...] Today only about three hundred transcription factors, with an average of about twenty regulatory sequence elements each, have been well characterized experimentally, for example for the human genome [13]. The human genome, by contrast, encodes some 3,000 sequence-specific transcription factors and at least 100,000 regulatory elements [2,12,13]. Despite this enormous discrepancy, five fundamental properties of gene regulatory coding have been established [1]. On the basis of these observations, a histone- or chromatin-code hypothesis has been developed that places chromatin at the heart of gene regulatory control [1,2,15].
