Abstract

Background: Sequence features in promoter regions are involved in regulating gene transcription initiation. Although numerous computational methods have been developed for predicting transcriptional start sites (TSSs) or transcription factor (TF) binding sites (TFBSs), they do not consider some important regulatory features such as CpG islands, tandem repeats, the TATA box, CCAAT box, GC box, over-represented oligonucleotides, DNA stability, and GC content. Additionally, the combinatorial interaction of TFs regulates groups of genes that share the same expression pattern. To investigate gene transcriptional regulation, an integrated system that annotates regulatory features in a promoter sequence and detects co-regulation of TFs in a group of genes is needed.

Results: This work identifies TSSs and regulatory features in a promoter sequence, and recognizes co-occurrence of cis-regulatory elements in co-expressed genes using a novel system. Three well-known TSS prediction tools are incorporated with orthologous conserved features, such as CpG islands, nucleotide composition, over-represented hexamer nucleotides, and DNA stability, to construct the novel Gene Promoter Miner (GPMiner) using a support vector machine (SVM). According to five-fold cross-validation results, the predictive sensitivity and specificity are both roughly 80%. The proposed system allows users to input a group of gene names/symbols, enabling the co-occurrence of TFBSs to be determined. Additionally, an input sequence can be analyzed for homogeneity with experimental mammalian promoter sequences, and conserved regulatory features between homologous promoters can be observed through cross-species analysis. After promoter regions are identified, regulatory features are visualized graphically to facilitate gene promoter observations.

Conclusions: GPMiner, which has a user-friendly input/output interface, offers numerous benefits for analyzing human and mouse promoters.
The proposed system is freely available at http://GPMiner.mbc.nctu.edu.tw/.
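
As an illustration of two of the sequence features named in the abstract (GC content and CpG islands), the following is a minimal sketch and not GPMiner's actual implementation. It assumes the widely used Gardiner-Garden and Frommer criteria for a CpG island (length of at least 200 bp, GC content above 50%, observed/expected CpG ratio above 0.6); the function names are hypothetical, and the abstract does not state which criteria GPMiner applies.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C nucleotides in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count finds every CpG dinucleotide.
    return seq.count("CG") * len(seq) / (c * g)


def looks_like_cpg_island(seq: str) -> bool:
    """Assumed thresholds: length >= 200 bp, GC > 50%, obs/exp CpG > 0.6."""
    return len(seq) >= 200 and gc_content(seq) > 0.5 and cpg_obs_exp(seq) > 0.6
```

In practice such features would be computed over sliding windows around a candidate TSS and passed to the classifier together with the other features; the window size and step are further choices the abstract does not specify.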

Highlights

  • Sequence features in promoter regions are involved in regulating gene transcription initiation

  • Gene transcription is regulated by transcription factors (TFs) that bind to the promoter region, which is the crucial control region for transcriptional activation of all genes [1]

  • A typical promoter sequence, which is located near the transcriptional start site (TSS), is


Summary

Introduction

Sequence features in promoter regions are involved in regulating gene transcription initiation. The combinatorial interaction of TFs regulates groups of genes that share the same expression pattern. Some co-regulatory networks describe the set of all significant associations among TFs in regulating common target genes [5]. Veerla et al. recently developed the SMART software for identifying co-occurring TFBSs in the promoters of a gene set [7]. This software does not have a user-friendly interface for identifying TSSs with regulatory elements and efficiently analyzing combinatorial TFBSs in a group of promoters. TOUCAN is a Java application for identifying significant cis-regulatory elements from sets of co-expressed genes, but it ignores combinatorial TFBS analysis [9]. This work develops a novel system, Gene Promoter Miner (GPMiner), for identifying co-occurring TFBSs in a group of gene promoters.
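
The idea of co-occurring TFBSs in a group of promoters can be sketched as a simple pair count. This is a hedged toy example, not the method used by GPMiner, SMART, or TOUCAN: it assumes TFBS predictions are already available as a set of TF names per gene, the function name and the `min_genes` parameter are hypothetical, and the gene-to-TF assignments are made-up data. A real analysis would also test whether a pair occurs more often than expected by chance.

```python
from collections import Counter
from itertools import combinations


def cooccurring_tf_pairs(sites_by_gene, min_genes=2):
    """Count TF pairs whose predicted sites appear in the same promoter.

    sites_by_gene maps a gene name to the TFs with at least one predicted
    binding site in that gene's promoter. Returns the pairs found together
    in at least min_genes promoters, with their promoter counts.
    """
    counts = Counter()
    for tfs in sites_by_gene.values():
        # Sort so each unordered pair has one canonical key.
        for pair in combinations(sorted(set(tfs)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_genes}


# Made-up toy input: three genes and their predicted TFs.
toy = {
    "geneA": {"SP1", "TBP", "NFY"},
    "geneB": {"SP1", "NFY"},
    "geneC": {"TBP"},
}
pairs = cooccurring_tf_pairs(toy)
```

With this toy input, only the NFY/SP1 pair is shared by two promoters, so it is the only pair returned.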

