Abstract

The development of high-throughput genomic sequencing coupled with chromatin immunoprecipitation makes it possible to study the binding sites of transcription factors (TFs) at the genome scale. The growing volume of data on experimentally determined binding sites raises qualitatively new problems for the analysis of gene expression regulation, the prediction of TF target genes, and the reconstruction of regulatory gene networks. Genome regulation in plants remains insufficiently studied, although plants have complex molecular mechanisms that regulate gene expression and the response to environmental stresses. It is therefore important to develop new software tools for analyzing the location of TF binding sites and their clustering in plant genomes, for visualizing them, and for producing the associated statistical estimates. This study applies the analysis of multiple TF binding profiles to three evolutionarily distant model plant organisms. The construction and analysis of non-random ChIP-seq binding clusters of different TFs in mammalian embryonic stem cells were discussed earlier using similar bioinformatics approaches. Such clusters of TF binding sites may indicate gene regulatory regions, enhancers, and gene transcription regulatory hubs. They can be used for the analysis of gene promoters and as a basis for the reconstruction of transcription networks. We discuss statistical estimates of the TF binding site clusters in the model plant genomes. The distributions of the number of different TFs per binding cluster follow the same power-law distribution for all the genomes studied. The binding clusters in the Arabidopsis thaliana genome are discussed in detail.

Summary

Introduction

The regulation of gene transcription plays a key role in cell functioning [1]. The study of transcription regulation based on transcription factor (TF) binding data is one of the most developed fields of bioinformatics. New problems arise for combinatorial regulation of gene expression based on the binding of multiple TFs. Does multiple TF binding in a gene region determine the developmental program of the organism, or is it just a random effect unrelated to gene function [4]? Statistical estimates of multiple TF binding may highlight fundamental features of gene expression common to any eukaryotic genome [5]. By linking gene promoter regions with high-throughput TF binding data, we may reconstruct regulatory gene networks at the genome scale [1]. However, a TF binding site may lie far from the gene transcription start in the genomic sequence, which prevents reliable prediction of TF target genes. Despite the development of sophisticated data analysis techniques, predicting the regulatory regions of transcription from the nucleotide sequence alone, or from single binding events, remains a complex problem.
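To make the idea of multiple TF binding in one region concrete, the following minimal Python sketch groups binding sites that lie close together on a chromosome and counts how many different TFs fall into each group. It is an illustration only, not the procedure used in the study: the function names, the 200 bp `max_gap` threshold, and the toy coordinates are all assumptions introduced here.

```python
from collections import Counter


def cluster_binding_sites(sites, max_gap=200):
    """Group binding sites into clusters of neighbouring positions.

    sites   -- iterable of (chromosome, position, tf_name) tuples
    max_gap -- hypothetical distance threshold in bp (an assumption,
               not a value taken from the study)
    Two consecutive sites on the same chromosome join the same cluster
    when they are at most max_gap bp apart.
    """
    clusters = []
    current = []
    for chrom, pos, tf in sorted(sites):
        # Start a new cluster on a chromosome change or a large gap.
        if current and (chrom != current[-1][0] or pos - current[-1][1] > max_gap):
            clusters.append(current)
            current = []
        current.append((chrom, pos, tf))
    if current:
        clusters.append(current)
    return clusters


def distinct_tf_histogram(clusters):
    """Count clusters by the number of different TFs they contain."""
    return Counter(len({tf for _, _, tf in cluster}) for cluster in clusters)


# Toy data, invented for illustration only.
toy_sites = [
    ("chr1", 100, "TF_A"),
    ("chr1", 150, "TF_B"),
    ("chr1", 300, "TF_A"),
    ("chr1", 5000, "TF_C"),
    ("chr2", 120, "TF_A"),
    ("chr2", 130, "TF_A"),
]

toy_clusters = cluster_binding_sites(toy_sites, max_gap=200)
toy_histogram = distinct_tf_histogram(toy_clusters)
for n_tfs in sorted(toy_histogram):
    print(n_tfs, "distinct TF(s):", toy_histogram[n_tfs], "cluster(s)")
```

On real ChIP-seq peak sets, a histogram of this kind is the quantity whose shape can be compared across genomes, for example by plotting it on log-log axes. The choice of distance threshold strongly affects the clusters obtained, so any value should be treated as a parameter to explore.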
