Abstract

Background: Transcription initiation is regulated through sequence-specific interactions between DNA-binding proteins (transcription factors) and cis-elements, where the BRE, TATA, INR, DPE and MTE motifs constitute the canonical core motifs for basal transcription initiation of genes. Accurate identification of transcription start sites (TSSs) and their corresponding promoter regions is critical for delineating these motifs. To date, genome-scale analysis of core promoter architecture in insects has been confined to Drosophila. The recently sequenced tsetse fly genome provides a unique opportunity to analyze the transcription initiation machinery of a blood-feeding insect.

Results: A computational method for identifying TSSs in the newly sequenced tsetse fly genome was evaluated using TSS-seq tags sampled from two developmental stages, larvae and pupae. Of 3134 tag clusters, 45.4 % (1424) mapped to first coding exons or their proximal predicted 5′UTR regions, and 1.0 % (31) mapped to transposons, within a threshold of 100 tags per cluster. The remaining 1393 non-transposon-derived core promoters had a propensity for AT nucleotides. The −1/+1 positions in D. melanogaster and G. m. morsitans had a propensity for CA and AA dinucleotides, respectively. The 1393 tag clusters comprised narrow promoters (5 %), broad with peak promoters (23 %) and broad without peak promoters (72 %). Two-way motif co-occurrence analysis showed that the MTE-DPE pair is over-represented in broad core promoters. The most frequently occurring motif triplets in all promoter classes are INR-MTE-DPE, TATA-MTE-DPE and TATA-INR-DPE. Promoters without the TATA motif had a higher frequency of the MTE and INR motifs than observed in Drosophila, where the DPE motif occurs more frequently in promoters without a TATA motif. Gene ontology terms associated with developmental processes were over-represented in the narrow and broad with peak promoters.

Conclusions: The study has identified different motif combinations associated with broad promoters in a blood-feeding insect. In TATA-less core promoters, G. m. morsitans uses the MTE to compensate for the lack of a TATA motif. The increasing availability of TSS-seq data allows existing gene annotation datasets to be revised, with the potential to identify new transcriptional units.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-1921-6) contains supplementary material, which is available to authorized users.

Highlights

  • Transcription initiation is regulated through sequence-specific interactions between DNA-binding proteins and cis-elements, where the BRE, TATA, INR, downstream promoter element (DPE) and motif ten element (MTE) motifs constitute the canonical core motifs for basal transcription initiation of genes

  • The core promoter architecture has been elucidated for human and mouse [5] and the fruit fly [6,7,8,9,10] among which canonical core promoter motifs are conserved [2, 4, 11]

  • Genome mapping and transcription start site (TSS) seq clustering: the current assembly of the G. m. morsitans genome consists of 13,807 scaffolds, of which 3058 contain at least one gene [1]


Summary

Introduction

Transcription initiation is regulated through sequence-specific interactions between DNA-binding proteins (transcription factors) and cis-elements, where the BRE, TATA, INR, DPE and MTE motifs constitute the canonical core motifs for basal transcription initiation of genes. Accurate identification of transcription start sites (TSSs) and their corresponding promoter regions is critical for delineating these motifs. To date, genome-scale analysis of core promoter architecture in insects has been confined to Drosophila. The core promoter constitutes the minimal portion of the promoter required to properly initiate transcription. It encompasses the TSS, extending either upstream or downstream for ~50 bases [2,3,4]. Canonical core promoter motifs include the TATA, initiator (INR), TFIIB recognition element (BRE) and downstream promoter element (DPE) motifs.
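To make the idea of a core promoter window concrete, the following is a minimal sketch of scanning the ~50 bases on either side of a TSS for consensus motifs. It is not the method used in the study. The consensus strings (TATAWAAR, TCAKTY, RGWYV) are commonly cited Drosophila forms assumed here for illustration, and the names `scan_core_promoter`, `CONSENSUS` and `flank` are hypothetical. Offsets are reported with the TSS base at 0, which differs from the −1/+1 convention used in the text.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Assumed consensus strings (illustrative only; not taken from the study).
CONSENSUS = {
    "TATA": "TATAWAAR",
    "INR": "TCAKTY",
    "DPE": "RGWYV",
}


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus into a lookahead regex so that
    overlapping matches are all reported."""
    body = "".join(IUPAC[base] for base in consensus)
    return re.compile("(?=" + body + ")")


def scan_core_promoter(sequence, tss_index, flank=50):
    """Return {motif: [offsets]} for matches on the given strand within
    `flank` bases of the TSS. Offset 0 is the TSS base itself."""
    start = max(0, tss_index - flank)
    window = sequence[start:tss_index + flank + 1].upper()
    hits = {}
    for name, consensus in CONSENSUS.items():
        pattern = consensus_to_regex(consensus)
        hits[name] = [m.start() + start - tss_index
                      for m in pattern.finditer(window)]
    return hits


# Synthetic example: a TATA box, an INR and a DPE-like site around index 50.
example = ("C" * 20 + "TATAAAAG" + "C" * 20 + "TCATTC"
           + "C" * 24 + "AGACG" + "C" * 10)
example_hits = scan_core_promoter(example, tss_index=50)
```

A plain consensus match ignores positional constraints (for example, a DPE is only meaningful at a fixed distance downstream of the INR) and the reverse strand, so a real analysis would typically use position weight matrices with position-specific scoring instead.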

