Abstract

Transcription initiation, essential to gene expression regulation, involves recruitment of basal transcription factors to the core promoter elements (CPEs). The distribution of currently known CPEs across plant genomes is largely unknown. This is the first large scale genome-wide report on the computational prediction of CPEs across eight plant genomes to help better understand the transcription initiation complex assembly. The distribution of thirteen known CPEs across four monocots (Brachypodium distachyon, Oryza sativa ssp. japonica, Sorghum bicolor, Zea mays) and four dicots (Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera, Glycine max) reveals the structural organization of the core promoter in relation to the TATA-box as well as with respect to other CPEs. The distribution of known CPE motifs with respect to transcription start site (TSS) exhibited positional conservation within monocots and dicots with slight differences across all eight genomes. Further, a more refined subset of annotated genes based on orthologs of the model monocot (O. sativa ssp. japonica) and dicot (A. thaliana) genomes supported the positional distribution of these thirteen known CPEs. DNA free energy profiles provided evidence that the structural properties of promoter regions are distinctly different from that of the non-regulatory genome sequence. It also showed that monocot core promoters have lower DNA free energy than dicot core promoters. The comparison of monocot and dicot promoter sequences highlights both the similarities and differences in the core promoter architecture irrespective of the species-specific nucleotide bias. This study will be useful for future work related to genome annotation projects and can inspire research efforts aimed to better understand regulatory mechanisms of transcription.

Highlights

  • Despite numerous technological advances in biological and computational sciences in the post- genome era, our basic understanding of gene regulatory mechanisms remains primitive

  • Only the transcripts annotated with a 59 untranslated region (59UTR) and high quality filtered gene-set were used for core promoter elements (CPEs) predictions (Table S1)

  • Identification and characterization of the core promoter binding site motifs of the transcription factors participating in the formation of pre-initiation complex (PIC) complex will help us understand the core promoter architecture and establish the processes by which plant basal transcription machinery functions

Read more

Summary

Introduction

Despite numerous technological advances in biological and computational sciences in the post- genome era, our basic understanding of gene regulatory mechanisms remains primitive. The fundamental need to understand RNA polymerase II (polII) mediated transcription initiation is well recognized for developing system level understanding of the condition-specific gene regulatory networks (GRNs). It is still challenging to accurately identify the transcription start site (TSS) and predict the functional genomic elements in the promoter region. Incorporation of TSS and cisregulatory element identification tools into genome annotation pipelines has yet to become a common practice. While experimental approaches like yeast-1-hybrid (Y1H) [4] and chromatinimmunoprecipitation (ChIP) assays [5] have made great strides in identifying transcription factor binding sites (TFBS) for a few model organisms, there are still technical and cost barriers to implement these methods on a large scale. There is a need for robust bioinformatics methods that can accurately identify TSS and predict the TFBS for the plant genomes. Reliable prediction of core promoter elements holds the promise to bridge this gap

Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call