Abstract
Background: Automated function prediction has played a central role in determining the biological functions of bacterial proteins. Typically, protein function annotation relies on homology, and function is inferred from other proteins with similar sequences. This approach has become popular in bacterial genomics because it is one of the few methods that is practical for large datasets and because it does not require additional functional genomics experiments. However, the existing solutions produce erroneous predictions in many cases, especially when query sequences have low levels of identity with the annotated source protein. This problem has created a pressing need for improvements in homology-based annotation.

Results: We present an automated method for the functional annotation of bacterial protein sequences. Based on sequence similarity searches, BLANNOTATOR accurately annotates query sequences with one-line summary descriptions of protein function. It groups sequences identified by BLAST into subsets according to their annotation and bases its prediction on a set of sequences with consistent functional information. We show the results of BLANNOTATOR's performance in sets of bacterial proteins with known functions. We simulated the annotation process for 3090 SWISS-PROT proteins using a database in its state preceding the functional characterisation of the query protein. For this dataset, our method outperformed the five others that we tested, and the improved performance was maintained even in the absence of highly related sequence hits. We further demonstrate the value of our tool by analysing the putative proteome of Lactobacillus crispatus strain ST1.

Conclusions: BLANNOTATOR is an accurate method for bacterial protein function prediction. It is practical for genome-scale data and does not require pre-existing sequence clustering; thus, this method suits the needs of bacterial genome and metagenome researchers.
The method and a web-server are available at http://ekhidna.biocenter.helsinki.fi/poxo/blannotator/.
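
The grouping idea described in the Results can be illustrated with a minimal sketch. This is not BLANNOTATOR's actual implementation: the `Hit` record, the `normalise` helper, and the rule of scoring each group by its summed bit score are all assumptions made for illustration, and the abstract does not specify how groups are formed or ranked.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class Hit(NamedTuple):
    """One BLAST hit (hypothetical record layout, not the tool's own)."""
    subject_id: str
    description: str   # one-line description of the database protein
    bit_score: float


def normalise(description: str) -> str:
    """Crude normalisation so trivially different descriptions share a group."""
    return " ".join(description.lower().split())


def annotate(hits: List[Hit]) -> Optional[str]:
    """Return the description backed by the best-supported group of hits."""
    groups: Dict[str, List[Hit]] = defaultdict(list)
    for hit in hits:
        groups[normalise(hit.description)].append(hit)
    if not groups:
        return None
    # Assumed scoring rule: a group's support is the sum of its bit scores.
    best_group = max(groups.values(), key=lambda g: sum(h.bit_score for h in g))
    # Report the description of the strongest hit inside the winning group.
    return max(best_group, key=lambda h: h.bit_score).description
```

Under this assumed rule, two moderately scoring hits that agree on a description can outweigh a single higher-scoring hit with a different one, which is the kind of consistency the abstract refers to.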
Highlights
Automated function prediction has played a central role in determining the biological functions of bacterial proteins.
Many automated protein function prediction methods describe the biological role of a gene product in terms of a single-line description of protein function (DE) or Gene Ontology (GO) terms.
We present a computational method for protein function prediction that relies on the concept of homology to annotate a query sequence with one-line summary descriptions of protein function.
Summary
Automated function prediction has played a central role in determining the biological functions of bacterial proteins. Protein function annotation relies on homology, and function is inferred from other proteins with similar sequences. This approach has become popular in bacterial genomics because it is one of the few methods that is practical for large datasets and because it does not require additional functional genomics experiments. Several valuable tools have been developed for the prediction of DE or GO annotations, but the most popular tools are based on the concept of homology [5,6,7]. The premise of this technique is that the functional properties of related sequences are conserved during evolution, so the function of the query protein can be inferred from that of other proteins with similar sequences. It has been suggested, for example, that at least 40-60% sequence identity is needed to infer enzymatic function accurately [5,6,19].
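
As a point of comparison, plain homology transfer with an identity cutoff might look like the sketch below. The function name, the `(description, percent_identity)` input format, and the default cutoff of 40% (the lower end of the range quoted above) are assumptions chosen for illustration, not values prescribed by the source.

```python
from typing import Optional, Sequence, Tuple


def transfer_from_best_hit(
    hits: Sequence[Tuple[str, float]],
    min_identity: float = 40.0,  # assumed cutoff, in percent identity
) -> Optional[str]:
    """Copy the description of the most similar hit if it is similar enough.

    `hits` holds (description, percent_identity) pairs for one query.
    Returns None when there are no hits or the best hit is below the cutoff.
    """
    if not hits:
        return None
    description, identity = max(hits, key=lambda hit: hit[1])
    return description if identity >= min_identity else None
```

A query whose best hit falls below the cutoff is left unannotated here, which shows why this simple approach struggles when no highly related sequence is available.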