Abstract

With the exponential growth in the number of protein sequences and structures determined by genome sequencing and structural genomics efforts, there is a growing need for reliable computational methods to determine the biochemical function of these proteins. This paper reviews efforts to address the challenge of annotating the molecular-level function of uncharacterized proteins. While sequence- and three-dimensional-structure-based methods for protein function prediction have been reviewed previously, recent trends in local structure-based methods have received less attention. These local structure-based methods are the primary focus of this review. Computational methods have been developed to predict the residues important for catalysis, and the local spatial arrangements of these residues can be used to identify protein function. In addition, combining different types of methods can yield more information and better function predictions for proteins of unknown function. Global initiatives, including the Enzyme Function Initiative (EFI), COMputational BRidges to EXperiments (COMBREX), and the Critical Assessment of Function Annotation (CAFA), are evaluating and testing the different approaches to predicting the function of such proteins. These initiatives and global collaborations will increase the capability and reliability of methods to predict biochemical function computationally and will add substantial value to the current volume of structural genomics data by reducing the number of absent or inaccurate functional annotations.

Highlights

  • The number of protein sequences and structures in databases such as UniProt [1] and the Protein Data Bank (PDB) [2] has grown significantly since the inception of genome sequencing and high-throughput structure determination

  • The process of annotating proteins of unknown and uncertain functions continues to be challenging yet critical for understanding the enormous amount of information generated by genome sequencing and structural genomics projects

  • Function prediction methods that focus on the local spatial region of biochemical activity show promise for improving predictive capability

Summary

Introduction

The number of protein sequences and structures in databases such as UniProt [1] and the Protein Data Bank (PDB) [2] has grown significantly since the inception of genome sequencing and high-throughput structure determination. Because the Protein Structure Initiative (PSI) has been primarily concerned with high-volume structure determination and prompt public availability of protein structures, most of these structures lack reliable accompanying information about their biochemical function; in some cases, no functional annotation is given. Most of these proteins are assigned a putative or possible function based on the closest sequence or structure match; these assignments are often incorrect [8,9,10], and the incorrect functional labels can propagate within databases [11,12]. The development and implementation of new, reliable computational methods is therefore an important part of the solution to the challenge of assigning function to proteins.

Functional site prediction methods
Sequence-based methods
Structure-based methods
Combined methods
Local active site prediction methods
Community initiatives and projects
Findings
Summary and outlook