Abstract
Regulons are the basic units of the response system in a bacterial cell, and each consists of a set of transcriptionally co-regulated operons. Regulon elucidation is the basis for studying the global transcriptional regulation network of a bacterium. In this study, we designed a novel co-regulation score between a pair of operons, based on accurate operon identification and cis-regulatory motif analyses, which captures their co-regulation relationship much better than other scores. Building on this score, we developed a new computational framework and built a novel graph model for regulon prediction. This model integrates motif comparison and clustering, making the regulon prediction problem substantially more tractable and its solutions more accurate. To evaluate our predictions, we designed a regulon coverage score based on the documented regulons and their overlap with our predictions, and implemented a modified Fisher's exact test to measure how well our predictions match the co-expressed modules derived from E. coli microarray gene-expression datasets collected under 466 conditions. The results indicate that our program consistently outperformed others in prediction accuracy. This suggests that our algorithms substantially improve on the state of the art, leading to a computational capability to reliably predict regulons for any bacterium.
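The abstract does not spell out how the Fisher's exact test was modified, so the following is only a minimal sketch of the standard, unmodified one-sided test, which asks whether a predicted regulon overlaps a co-expression module more than chance would allow. The function name, its parameters, and the gene counts in the usage line are hypothetical and are not taken from the paper.

```python
from math import comb


def fisher_exact_enrichment(overlap, regulon_size, module_size, total_genes):
    """One-sided Fisher's exact test p-value (standard form, not the paper's variant).

    Returns P(X >= overlap), where X is the number of genes shared between a
    predicted regulon and a co-expression module if the regulon's genes were
    drawn at random from the genome (hypergeometric null).
    """
    if max(regulon_size, module_size) > total_genes:
        raise ValueError("gene sets cannot be larger than the genome")
    max_overlap = min(regulon_size, module_size)
    if not 0 <= overlap <= max_overlap:
        raise ValueError("overlap must lie between 0 and the smaller set size")

    # Number of ways to choose the regulon's genes from the whole genome.
    denominator = comb(total_genes, regulon_size)

    # Sum the hypergeometric probabilities for every overlap at least as
    # large as the observed one. comb(n, k) is 0 when k > n, so impossible
    # configurations contribute nothing.
    tail = sum(
        comb(module_size, k) * comb(total_genes - module_size, regulon_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denominator


# Hypothetical counts, for illustration only.
p_value = fisher_exact_enrichment(
    overlap=8, regulon_size=12, module_size=40, total_genes=4000
)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value here would indicate that the predicted regulon and the module share more genes than expected by chance; with many regulon-module pairs, a multiple-testing correction would normally be applied on top.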
Highlights
Regulons are the basic units of the response system in a bacterial cell, and each consists of a set of transcriptionally co-regulated operons
We evaluated and refined our methods based on the 177 documented E. coli regulons from RegulonDB and its genome-scale microarray gene expression data collected under 466 conditions[7]
While numerous regulons have been experimentally identified in a few model organisms including E. coli K12, the full elucidation of all the regulons encoded in this or any bacterial genome may have to rely heavily on computational approaches
Summary
Regulons are the basic units of the response system in a bacterial cell, and each consists of a set of transcriptionally co-regulated operons. A successful elucidation of regulons will substantially improve the identification of transcriptionally co-regulated genes encoded in a bacterial genome, realistically allowing reliable prediction of global transcription regulation networks. Several challenges remain, however: (i) the phylogenetic footprinting technology is still not well defined in the selection of reference genomes and in measuring the evolutionary distance between any pair of genomes, which limits its efficacy in motif finding[23,24]; (ii) there is a lack of a reliable measure of motif similarity, and current motif comparison using aligned motif profiles usually produces too many false positives[26]; (iii) better operon prediction algorithms are missing, especially ones that utilize high-throughput RNA-sequencing data[24,27]; and (iv) current regulon prediction methods usually cluster the motif signals directly, which leads to unreliable predictions due to random matches between motifs[23]. A more ingenious design is required in the clustering step.