Abstract

Predicting protein function from structure remains an active area of interest, particularly for the structural genomics initiatives, where a substantial number of structures are initially solved with little or no functional characterisation. Although global structure comparison methods can be used to transfer functional annotations, the relationship between fold and function is complex, particularly in functionally diverse superfamilies that have evolved through different secondary structure embellishments to a common structural core. The majority of prediction algorithms employ local templates built on known or predicted functional residues. Here, we present a novel method (FLORA) that automatically generates structural motifs associated with different functional subgroups (FSGs) within functionally diverse domain superfamilies. Templates are created purely on the basis of their specificity for a given FSG; the method makes no prior prediction of functional sites, nor does it assume specific physico-chemical properties of residues. FLORA accurately discriminates between homologous domains with different functions and substantially outperforms popular structure comparison methods and a leading function prediction method, with a 2–3 fold increase in coverage at low error rates. We benchmark FLORA on a large data set of enzyme superfamilies from all three major protein classes (α, β, αβ) and demonstrate the functional relevance of the motifs it identifies. We also provide novel predictions of enzymatic activity for a large number of structures solved by the Protein Structure Initiative. Overall, we show that FLORA effectively detects functionally similar protein domain structures using only patterns of structural conservation across all residues.

Highlights

  • The prediction of protein function from structure has become of increasing interest as a significant proportion [1] of structures solved by the structural genomics initiatives (SGI) lack functional annotation

  • Structural Genomics Initiatives have been set up to investigate these structures on a large scale and make the data available to the wider biological research community

  • FLORA was designed as a generic method to create structural motifs that can discriminate between different functional subgroups (FSGs) within diverse domain superfamilies. It makes no assumptions about the physico-chemical properties of functionally important residues and relies on a purely structure-based conservation score

Introduction

The prediction of protein function from structure has become of increasing interest as a significant proportion [1] of structures solved by the structural genomics initiatives (SGI) lack functional annotation (for a review see [2]). The problems with approaches based on local templates of functional residues are deciding what qualifies as a functional residue (e.g. one directly involved in catalysis or ligand binding) and creating biologically accurate templates for the ever-increasing number of protein structures deposited in the PDB [4]. Resources such as the Catalytic Site Atlas [5] are carefully curated by hand and restricted to residues directly involved in catalysis, whereas MSDSite [6] and PDBSite [7,8] generate templates based on active site residues defined in the PDB file by the authors. Although these resources are undoubtedly extremely valuable, it is questionable whether sufficient coverage of the PDB can be maintained when manual intervention is required.
