Abstract

Regulons are the basic units of the response system in a bacterial cell, and each consists of a set of transcriptionally co-regulated operons. Regulon elucidation is the basis for studying the bacterial global transcriptional regulation network. In this study, we designed a novel co-regulation score between a pair of operons based on accurate operon identification and cis-regulatory motif analyses, which can capture their co-regulation relationship much better than other scores. Taking full advantage of this discovery, we developed a new computational framework and built a novel graph model for regulon prediction. This model integrates motif comparison and clustering, making the regulon prediction problem more tractable and its solutions more accurate. To evaluate our predictions, we designed a regulon coverage score based on the documented regulons and their overlap with our predictions, and we implemented a modified Fisher exact test to measure how well our predictions match the co-expressed modules derived from E. coli microarray gene-expression datasets collected under 466 conditions. The results indicate that our program consistently outperformed others in prediction accuracy. This suggests that our algorithms substantially improve the state of the art, leading to a computational capability to reliably predict regulons for any bacterium.
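The evaluation described above tests whether a predicted regulon overlaps a co-expressed module more than chance would allow. The abstract does not specify how the Fisher exact test was modified, so the following is only a minimal sketch of the standard one-sided test (a hypergeometric tail probability). The function name, its parameters, and the numbers in the usage lines are hypothetical and not taken from the paper.

```python
from math import comb


def overlap_p_value(universe_size, regulon_size, module_size, overlap):
    """One-sided Fisher exact test for set overlap.

    Returns P(X >= overlap), where X is the number of operons shared by a
    predicted regulon and a co-expression module when the module is drawn
    at random from the universe (hypergeometric null).
    NOTE: this is the standard test, not the paper's modified version.
    """
    max_overlap = min(regulon_size, module_size)
    total = comb(universe_size, module_size)
    tail = sum(
        comb(regulon_size, k)
        * comb(universe_size - regulon_size, module_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical example: 2000 operons in total, a predicted regulon of 12
# operons, a co-expressed module of 40 operons, 6 operons in common.
p = overlap_p_value(2000, 12, 40, 6)
print(f"p-value: {p:.3g}")
```

A small p-value indicates that the overlap is unlikely under random assignment, which is the sense in which a prediction "matches" a co-expressed module in this sketch.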

Highlights

  • Regulons are the basic units of the response system in a bacterial cell, and each consists of a set of transcriptionally co-regulated operons

  • We evaluated and refined our methods based on the 177 documented E. coli regulons from RegulonDB and on E. coli genome-scale microarray gene-expression data collected under 466 conditions[7]

  • While numerous regulons have been experimentally identified in a few model organisms including E. coli K12, the full elucidation of all the regulons encoded in this or any bacterial genome may have to rely heavily on computational approaches


Summary

Introduction

Regulons are the basic units of the response system in a bacterial cell, and each consists of a set of transcriptionally co-regulated operons. A successful elucidation of regulons will substantially improve the identification of transcriptionally co-regulated genes encoded in a bacterial genome, realistically allowing reliable prediction of global transcription regulation networks. Several challenges remain: (i) phylogenetic footprinting is still not well defined with respect to selecting reference genomes and measuring the evolutionary distance between any pair of genomes, which limits its efficacy for motif finding[23,24]; (ii) there is no reliable measure of motif similarity, and current motif comparison using aligned motif profiles usually produces too many false positives[26]; (iii) better operon prediction algorithms are missing, especially ones that use high-throughput RNA-sequencing data[24,27]; and (iv) current regulon prediction methods usually cluster the motif signals directly, which leads to unreliable predictions because motifs can match at random[23]. A more ingenious design is required in the clustering step.
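As an illustration of clustering at the operon level rather than on raw motif signals, the sketch below builds a graph whose nodes are operons and whose edges connect pairs with a co-regulation score at or above a cutoff, then reports connected components as candidate regulons. This is a deliberately simple stand-in and not the authors' graph model: the thresholding rule, the function name, the operon labels, and the scores are all assumptions made for the example.

```python
from collections import defaultdict


def cluster_operons(scores, threshold):
    """Group operons into connected components of a thresholded score graph.

    scores: dict mapping (operon_a, operon_b) -> co-regulation score
            (hypothetical values; the paper's score is not reproduced here).
    threshold: minimum score for two operons to be linked by an edge.
    """
    adjacency = defaultdict(set)
    for (a, b), score in scores.items():
        # Touch both keys so unlinked operons still appear as singletons.
        adjacency[a]
        adjacency[b]
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen = set()
    clusters = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal to collect one connected component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical pairwise scores between five operons.
example_scores = {
    ("op1", "op2"): 0.9,
    ("op2", "op3"): 0.8,
    ("op3", "op4"): 0.1,
    ("op4", "op5"): 0.7,
}
print(cluster_operons(example_scores, threshold=0.5))
```

Because edges depend on a pairwise operon score rather than on direct motif-to-motif matches, a single spurious motif match has less opportunity to merge unrelated operons, provided the score itself is reliable.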
