Abstract

Background: De novo eukaryotic promoter prediction is important for discovering novel genes and understanding gene regulation. In spite of the great advances made in the past decade, recent studies revealed that the overall performance of current promoter prediction programs (PPPs) is still poor, and that predictions made by individual PPPs do not overlap one another. Furthermore, most PPPs are trained and tested on the most-upstream promoters; their performance on alternative promoters has not been assessed.

Results: In this paper, we evaluate the performance of the current major promoter prediction programs (i.e., PSPA, FirstEF, McPromoter, DragonGSF, DragonPF, and FProm) using 42,536 distinct human gene promoters on a genome-wide scale, with emphasis on alternative promoters. We describe an artificial neural network (ANN) based meta-predictor program that integrates predictions from the current PPPs and the predicted promoters' relation to CpG islands. Our specific analysis of recently discovered alternative promoters reveals that although only 41% of the 3'-most promoters overlap a CpG island, 74% of the 5'-most promoters overlap a CpG island.

Conclusion: Our assessment of six PPPs on 1.06 × 10^9 bp of human genome sequence reveals the specific strengths and weaknesses of individual PPPs. Our meta-predictor outperforms any individual PPP in sensitivity and specificity. Furthermore, we discovered that the 5' alternative promoters are more likely to be associated with a CpG island.

Highlights

  • De novo eukaryotic promoter prediction is important for discovering novel genes and understanding gene regulation

  • Both the DBTSS 5' and the RefSeq annotated transcription start sites (TSSs) were upstream of the coding sequence (CDS), and about 67% were within 1 kb upstream

  • This study differs from our study in the following: 1) Similar to other state-of-the-art promoter prediction programs (PPPs), this study focuses on most upstream TSS (MUTSS) prediction, whereas our study focuses on alternative transcription start sites (ATSS) prediction; 2) the assessment of the pro

Summary

Introduction

BMC Genomics 2007, 8:374 http://www.biomedcentral.com/1471-2164/8/374

De novo eukaryotic promoter prediction is important for discovering novel genes and understanding gene regulation. Initiation of transcription is regulated by the coordinated binding of many transcription factors to the core promoter region. The initiation process is further modulated by the binding of activators and repressors in more distal regions [1,2]. The core promoter is the region (usually ± 50 bps) around the transcription start site (TSS), which is vital for initiation of basal transcription. The core promoter contains several transcription factor binding sites that facilitate transcription initiation, such as the TATA box, the GC box, Inr [1,3], and the recently discovered MTE [4] and DPE [5].
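To make the notion of a core promoter window concrete, the following is a minimal sketch, not the method used by any of the PPPs discussed here. It extracts the ± 50 bp region around a given TSS and scans it for two of the elements named above. The function names (`core_promoter`, `scan_motifs`), the toy sequence, and the simplified consensus patterns are all assumptions made for illustration; real promoter elements are degenerate and are usually modeled with position weight matrices rather than exact patterns.

```python
import re

# Simplified consensus patterns, assumed here for illustration only.
MOTIFS = {
    "TATA box": re.compile(r"TATA[AT]A[AT]"),
    "GC box": re.compile(r"GGGCGG"),
}


def core_promoter(sequence, tss_index, flank=50):
    """Return the window of `flank` bases on each side of the TSS.

    `tss_index` is a 0-based index into `sequence`. The window is clipped
    at the sequence ends, and its start index is returned alongside it.
    """
    start = max(0, tss_index - flank)
    end = min(len(sequence), tss_index + flank + 1)
    return sequence[start:end], start


def scan_motifs(sequence, tss_index, flank=50):
    """List (motif name, offset from TSS) for each hit in the core promoter.

    Offsets are plain index differences, so the TSS itself is offset 0
    (unlike the biological convention, where the TSS is +1).
    """
    window, window_start = core_promoter(sequence.upper(), tss_index, flank)
    hits = []
    for name, pattern in MOTIFS.items():
        for match in pattern.finditer(window):
            hits.append((name, window_start + match.start() - tss_index))
    return sorted(hits, key=lambda hit: hit[1])


# Toy sequence (hypothetical): a TATA-like element placed 30 bp upstream
# of a TSS located at index 100.
toy_sequence = "C" * 70 + "TATAAAA" + "C" * 23 + "A" + "C" * 60
toy_hits = scan_motifs(toy_sequence, tss_index=100)
```

A fixed-width window is the simplest choice; the `flank` parameter can be widened to include more distal regulatory regions, at the cost of more spurious pattern matches.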
