Abstract
Background: Functional annotation of bacterial genomes is an obligatory and crucially important step of information processing from the genome sequences into cellular mechanisms. However, there is a lack of computational methods to evaluate the quality of functional assignments.
Results: We developed a genome-scale model that assigns Bayesian probability to each gene utilizing a known property of functional similarity between neighboring genes in bacteria.
Conclusions: Our model clearly distinguished true annotation from random annotation with Bayesian annotation probability >0.95. Our model will provide a useful guide to quantitatively evaluate functional annotation methods and to detect gene sets with reliable annotations.
Highlights
Functional annotation of bacterial genomes is an obligatory and crucially important step of information processing from the genome sequences into cellular mechanisms
We applied our methodology to Escherichia coli and Clostridium thermocellum to calculate the probability of annotation confidence (PAC) for NCBI annotation and compared it with “random” annotation
The NCBI annotations with lower PAC values may come from an insufficient number of detectable function similarities with genes in the neighborhood that were derived from the uncovered knowledge of Gene Ontology (GO) annotation and graph structure
Summary
Functional annotation of bacterial genomes is an obligatory and crucially important step of information processing from the genome sequences into cellular mechanisms. There is a lack of computational methods to evaluate the quality of functional assignments. We developed a genome-scale model that assigns Bayesian probability to each gene utilizing a known property of functional similarity between neighboring genes in bacteria. Our model will provide a useful guide to quantitatively evaluate functional annotation methods and to detect gene sets with reliable annotations.
Functional annotation of bacterial genomes is an obligatory and crucially important step of information processing from the genome sequences toward insights into cellular mechanisms, putative ecological roles, or predictive models of a given organism or microbial community. Databases, platforms, and score filters involve computational pipelines that assign functions to the genes [4]. The function of genes is central for all biological insights, including interpretation and design of experiments and comparative genomic analysis, as well as the