Simultaneous prediction of transcription factor binding sites in a group of prokaryotic genomes

Shaoqiang Zhang,Shan Li,Zhengchang Su,Phuc T Pham

doi:10.1186/1471-2105-11-397

Abstract

Background: Our current understanding of transcription factor binding sites (TFBSs) in sequenced prokaryotic genomes is very limited due to the lack of an accurate and efficient computational method for the prediction of TFBSs at a genome scale. In an attempt to change this situation, we have recently developed a comparative genomics based algorithm called GLECLUBS for de novo genome-wide prediction of TFBSs in a target genome. Although GLECLUBS has achieved rather high prediction accuracy of TFBSs in a target genome, it is still not efficient enough to be applied to all the sequenced prokaryotic genomes.

Results: Here, we designed a new algorithm based on GLECLUBS called extended GLECLUBS (eGLECLUBS) for simultaneous prediction of TFBSs in a group of related prokaryotic genomes. When tested on a group of γ-proteobacterial genomes including E. coli K12, a group of firmicutes genomes including B. subtilis and a group of cyanobacterial genomes using the same parameter settings, eGLECLUBS predicts more than 82% of known TFBSs in extracted inter-operonic sequences in both E. coli K12 and B. subtilis. Because each genome in a group is equally treated, it is highly likely that similar prediction accuracy has been achieved for each genome in the group.

Conclusions: We have developed a new algorithm for genome-wide de novo prediction of TFBSs in a group of related prokaryotic genomes. The algorithm has achieved the same level of accuracy and robustness as its predecessor GLECLUBS, but can work on dozens of genomes at the same time.

Highlights

Our current understanding of transcription factor binding sites (TFBSs) in sequenced prokaryotic genomes is very limited due to the lack of an accurate and efficient computational method for the prediction of TFBSs at a genome scale
TFBSs can be effectively identified by phylogenetic footprinting based on predicted Clusters of Operons with Orthologous Relationships (COORs) using multiple motif-finding tools
In a typical phylogenetic footprinting procedure with a single target genome, upstream intergenic sequences are extracted based on a group of orthologous genes of a gene in the target genome [3,4,5,6,7,8,9,10]
Application of the algorithm (Figure 1) to a group of target genomes comprising 32 γ-proteobacterial genomes including E. coli K12 [Additional file 1: group D in Supplemental Figure S1] resulted in 4,103 COORs and inter-operonic sequence sets, which contain 1,447 known E. coli K12 TFBSs as described above

Summary

Introduction

Our current understanding of transcription factor binding sites (TFBSs) in sequenced prokaryotic genomes is very limited due to the lack of an accurate and efficient computational method for the prediction of TFBSs at a genome scale. TFBSs are usually predicted by comparative analysis of multiple sequences that are known to contain or potentially contain TFBSs. Based on the observation that the transcriptional regulation machinery including TFBSs is relatively conserved in closely related genomes, various forms of phylogenetic footprinting algorithms have been developed to identify conserved DNA segments as possible TFBSs in the promoters of orthologous genes in a group of related prokaryotic [3,4,5,6,7,8,9] and fungal genomes [10]. For the convenience of discussion, in this paper, we refer to a set of similar TFBSs as a motif.
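The idea behind phylogenetic footprinting can be illustrated with a deliberately simplified sketch. The code below is not GLECLUBS or eGLECLUBS; it only assumes that each genome contributes one upstream sequence for an orthologous gene and reports exact k-mers shared by a chosen fraction of those sequences. The function name `conserved_kmers`, its parameters, and the example sequences are hypothetical and chosen for illustration. Real motif finders tolerate mismatches and score candidates statistically, which this toy version does not attempt.

```python
from collections import defaultdict


def conserved_kmers(upstream_seqs, k=6, min_fraction=1.0):
    """Return exact k-mers found in at least min_fraction of the sequences.

    upstream_seqs maps a genome name to the upstream (promoter) sequence of
    one orthologous gene. This is a toy stand-in for phylogenetic
    footprinting, not the method used in the paper.
    """
    support = defaultdict(set)  # k-mer -> genomes whose sequence contains it
    for genome, seq in upstream_seqs.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            support[seq[i:i + k]].add(genome)

    needed = min_fraction * len(upstream_seqs)
    return {
        kmer: sorted(genomes)
        for kmer, genomes in support.items()
        if len(genomes) >= needed
    }


# Hypothetical upstream sequences for one orthologous gene in three genomes.
example_upstream = {
    "genome_a": "ACGTTGTGACGGA",
    "genome_b": "TTTGTGACCCAT",
    "genome_c": "GGCATGTGACTA",
}

# Require the k-mer to occur in every genome.
print(conserved_kmers(example_upstream, k=6, min_fraction=1.0))
```

Lowering `min_fraction` relaxes the conservation requirement, so segments shared by only some of the genomes are also reported. In the terminology of the paper, a set of similar conserved segments recovered this way would be a candidate motif.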

Methods

Results

Discussion

Conclusion


Journal: BMC Bioinformatics	Publication Date: Jul 23, 2010
Citations: 53	License type: cc-by


