Abstract
Background
Cis-regulatory modules are combinations of regulatory elements occurring in close proximity to each other that control the spatial and temporal expression of genes. The ability to identify them in a genome-wide manner depends on the availability of accurate models and of search methods able to detect putative regulatory elements with enhanced sensitivity and specificity.
Results
We describe the implementation of a search method for putative transcription factor binding sites (TFBSs) based on hidden Markov models built from alignments of known sites. We built 1,079 models of TFBSs using experimentally determined sequence alignments of sites provided by the TRANSFAC and JASPAR databases and used them to scan sequences of the human, mouse, fly, worm and yeast genomes. In several test cases, the method correctly identified experimentally characterized sites, with better specificity and sensitivity than other similar computational methods. Moreover, a large-scale comparison using synthetic data showed that in the majority of cases our method performed significantly better than a nucleotide weight matrix-based method.
Conclusion
The search engine, available at , allows the identification, visualization and selection of putative TFBSs occurring in the promoter or other regions of a gene from the human, mouse, fly, worm and yeast genomes. In addition, it allows the user to upload a sequence to query and to build a model by supplying a multiple sequence alignment of binding sites for a transcription factor of interest. Due to its extensive database of models, powerful search engine and flexible interface, MAPPER represents an effective resource for the large-scale computational analysis of transcriptional regulation.
Highlights
Cis-regulatory modules are combinations of regulatory elements occurring in close proximity to each other that control the spatial and temporal expression of genes
TRANSFAC provides two sources of information on transcription factor (TF) binding sites: (1) the nucleotide sequences of binding sites referenced in the descriptions of the TRANSFAC matrices, which were optimally aligned and used to derive nucleotide weight matrices (NWMs); and (2) the nucleotide sequences of binding sites referenced in the descriptions of the TFs themselves (referred to as "factors"), which were used to extract alignments, designated below as factor-derived alignments and catalogued with accession numbers starting with "T"
The purpose of our work was to establish a methodology for detecting transcription factor binding sites (TFBSs) in multiple genomes with enough sensitivity and specificity to be effective in large-scale analysis
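To make the nucleotide weight matrix (NWM) baseline mentioned above concrete, the following is a minimal sketch of how a log-odds matrix might be derived from aligned binding sites and used to scan a sequence. It is not the MAPPER method, which is based on hidden Markov models, nor the TRANSFAC procedure. The function names (build_nwm, scan), the pseudocount, the uniform background frequency, the score threshold and the toy alignment are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def build_nwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds weight matrix from equal-length aligned sites.

    Assumes an ungapped alignment and a uniform background frequency;
    both are simplifications chosen for this sketch.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("aligned sites must all have the same length")
    total = len(sites) + pseudocount * len(BASES)
    matrix = []
    for pos in range(length):
        column = [s[pos].upper() for s in sites]
        scores = {}
        for base in BASES:
            # Pseudocounts keep unseen bases from producing log(0).
            freq = (column.count(base) + pseudocount) / total
            scores[base] = math.log2(freq / background)
        matrix.append(scores)
    return matrix

def scan(sequence, matrix, threshold):
    """Return (offset, window, score) for windows scoring >= threshold."""
    sequence = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(b not in BASES for b in window):
            continue  # skip windows containing ambiguous symbols such as N
        score = sum(matrix[i][b] for i, b in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits

# Hypothetical toy alignment, not taken from TRANSFAC or JASPAR.
toy_sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGAGTCA"]
nwm = build_nwm(toy_sites)
hits = scan("ccgtTGACTCAggaaTTTTTTT", nwm, threshold=7.0)
for offset, window, score in hits:
    print(offset, window, round(score, 3))
```

With this toy input, the only window that reaches the threshold is the embedded site at offset 4. A matrix of this kind scores each position independently, which is one reason a model built from the full alignment, such as a hidden Markov model, may behave differently on the same sites.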
Summary
Transcriptional regulation is accomplished by the coordinated activity of specific regulatory proteins that recognize and bind regulatory elements, short DNA motifs located in the untranscribed regions of genes [1]. Combinations of regulatory elements that occur in close proximity to each other form cis-regulatory modules that control the spatial and temporal expression of genes. The presence of such modules suggests the existence of a combinatorial code for transcriptional regulation [25], and ample effort has been devoted to developing algorithms for its elucidation [26,27,28,29,30,31,32,33,34].