Abstract
We assess the variability of protein function in protein sequence and structure space. Different regions of this space differ considerably in how well molecular function is locally conserved. We analyze and capture local function conservation by means of logistic curves. Based on this analysis, we propose a method for predicting the molecular function of a query protein with known structure but unknown function. The prediction method is rigorously assessed and compared with a previously published function predictor. Furthermore, we apply the method to 500 functionally unannotated PDB structures and discuss selected examples. The proposed approach provides a simple yet consistent statistical model for the complex relations between protein sequence, structure, and function. The GOdot method is available online (http://godot.bioinf.mpi-inf.mpg.de).
Highlights
Protein structure databases are growing at a rapid rate and, in recent years, structural genomics initiatives have increased the growth rate further
We present a method for protein function prediction based on a novel concept, called local function conservation
Protein sequence and structure information of an unannotated protein are used as input to GOdot, which predicts a list of Gene Ontology (GO) terms
Summary
Protein structure databases are growing at a rapid rate and, in recent years, structural genomics initiatives have increased the growth rate further. Some function prediction methods transfer function from similar sequences, such as GOtcha [7], Blast2GO [8], or PFP [9]. Phylogenomic methods, such as SIFTER [10] and Orthostrapper [11], consider knowledge of the evolution of homologous proteins. The underlying idea of similarity-based function transfer is that proteins with similar sequence and structural features are likely to perform the same function [27,28,29]. We take this principle one step further by examining groups of similar proteins. We estimate the rate of errors made when inferring protein function annotations based on protein sequence and structure similarity. Within the space spanned by the set of representative protein domains, we identify regions where function is locally conserved.
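As an illustration of how function conservation can be summarized by a logistic curve, the following minimal sketch fits a two-parameter logistic model to the probability that two proteins share a function, given their similarity score. It is not the GOdot implementation: the toy data, the names `scores`, `conserved`, `a`, and `b`, and the plain gradient-ascent fit are all assumptions made for this example.

```python
import math


def logistic(x, a, b):
    """Modelled probability that function is conserved at similarity score x."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))


def fit_logistic(scores, conserved, lr=0.5, epochs=5000):
    """Fit intercept a and slope b by gradient ascent on the mean log-likelihood.

    Illustrative fitting procedure only; the source does not state how its
    logistic curves are estimated.
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = 0.0
        grad_b = 0.0
        for x, y in zip(scores, conserved):
            err = y - logistic(x, a, b)  # residual of the current model
            grad_a += err
            grad_b += err * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


# Hypothetical toy data: similarity score of a protein pair (scaled to [0, 1])
# and whether the pair shares a function annotation (1) or not (0).
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
conserved = [0, 0, 0, 1, 0, 1, 1, 1, 1]

a, b = fit_logistic(scores, conserved)

# Estimated error rate of transferring function at a given similarity score.
for s in (0.2, 0.5, 0.8):
    p = logistic(s, a, b)
    print(f"score={s:.1f}  P(conserved)={p:.2f}  est. error rate={1 - p:.2f}")
```

Under this reading, fitting a separate curve to each neighbourhood of sequence and structure space would give a local estimate of how reliable a similarity-based function transfer is in that region; the steeper the fitted slope `b`, the more sharply conservation depends on similarity.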