Abstract

Operon prediction is a valuable component of microbial genome annotation because operon organization can yield inferences about gene function, and because knowledge of operon structure can aid the interpretation of gene expression data. We present a number of improvements to the existing Pathway Tools operon predictor, based mostly on seven new features that we hypothesized would increase its performance. The new features include shared Gene Ontology biological process terms, similarity of codon usage and GC content, correlated gene expression, and membership in a shared protein complex. We evaluated the seven proposed features and found that adding six of them improved the accuracy of the operon predictor from 79.55% to 83.49%, a decrease in error rate of 19.3%. When gene expression data were not included, the accuracy decreased to 82.547%, still a decrease in error rate of 14.7% relative to the original predictor. One of the proposed features and one previously used feature had no effect and were removed. Although some of the new features had strong predictive value individually, they did not have a large impact on predictive accuracy when combined with the other features, suggesting that they were not independent of the other features.
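
The error-rate figures quoted above follow from the reported accuracies if the error rate is taken to be 100% minus accuracy, and the decrease is measured relative to the original predictor's error rate. The short sketch below reproduces that arithmetic; the function name and the error-rate definition are assumptions made for illustration, not something the abstract spells out.

```python
def relative_error_reduction(baseline_acc_pct: float, new_acc_pct: float) -> float:
    """Percentage decrease in error rate, assuming error = 100 - accuracy (in %)."""
    baseline_err = 100.0 - baseline_acc_pct
    new_err = 100.0 - new_acc_pct
    return 100.0 * (baseline_err - new_err) / baseline_err


# Accuracies as reported in the abstract.
with_expression = relative_error_reduction(79.55, 83.49)
without_expression = relative_error_reduction(79.55, 82.547)

print(f"with gene expression data:    {with_expression:.1f}%")     # 19.3%
print(f"without gene expression data: {without_expression:.1f}%")  # 14.7%
```

Under this reading, an absolute gain of about four percentage points in accuracy corresponds to removing roughly one fifth of the original predictor's errors.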
