FunFam protein families improve residue level molecular function prediction

Linus Scheibenreif, Christine Orengo, Maria Littmann, Burkhard Rost

doi:10.1186/s12859-019-2988-x

Abstract

Background: The CATH database provides a hierarchical classification of protein domain structures including a sub-classification of superfamilies into functional families (FunFams). We analyzed the similarity of binding site annotations in these FunFams and incorporated FunFams into the prediction of protein binding residues.

Results: FunFam members agreed, on average, in 36.9 ± 0.6% of their binding residue annotations. This constituted a 6.7-fold increase over randomly grouped proteins and a 1.2-fold increase (1.1-fold on the same dataset) over proteins with the same enzymatic function (identical Enzyme Commission, EC, number). Mapping de novo binding residue prediction methods (BindPredict-CCS, BindPredict-CC) onto FunFam resulted in consensus predictions for those residues that were aligned and predicted alike (binding/non-binding) within a FunFam. This simple consensus increased the F1-score (for binding) 1.5-fold over the original prediction method. Variation of the threshold for how many proteins in the consensus prediction had to agree provided a convenient control of accuracy/precision and coverage/recall, e.g. reaching a precision as high as 60.8 ± 0.4% for a stringent threshold.

Conclusions: The FunFams outperformed even the carefully curated EC numbers in terms of agreement of binding site residues. Additionally, we assume that our proof-of-principle through the prediction of protein binding residues will be relevant for many other solutions profiting from FunFams to infer functional information at the residue level.
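The consensus idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the input layout (one list of per-column predictions per family member, with None for gaps), and the rule "fraction of aligned members predicted as binding must reach a threshold" are all assumptions made for illustration.

```python
def consensus_binding(aligned_predictions, threshold=0.5):
    """Return one consensus call (True = binding) per alignment column.

    aligned_predictions: list of per-protein lists, one entry per alignment
    column: True (predicted binding), False (predicted non-binding) or
    None (gap). This layout is a hypothetical choice for the sketch.
    threshold: minimum fraction of aligned (non-gap) members that must be
    predicted as binding for the column to be called binding.
    """
    n_columns = len(aligned_predictions[0])
    consensus = []
    for col in range(n_columns):
        # Only members with a residue in this column get a vote.
        votes = [row[col] for row in aligned_predictions if row[col] is not None]
        if not votes:
            consensus.append(False)
            continue
        fraction_binding = sum(votes) / len(votes)
        consensus.append(fraction_binding >= threshold)
    return consensus


def precision_recall_f1(predicted, observed):
    """Precision, recall and F1 for the binding class."""
    tp = sum(1 for p, o in zip(predicted, observed) if p and o)
    fp = sum(1 for p, o in zip(predicted, observed) if p and not o)
    fn = sum(1 for p, o in zip(predicted, observed) if not p and o)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy family of three aligned members and four alignment columns.
family_predictions = [
    [True, False, True, None],
    [True, False, False, True],
    [True, True, False, None],
]
strict = consensus_binding(family_predictions, threshold=0.5)
lenient = consensus_binding(family_predictions, threshold=0.3)
```

Raising the threshold makes the consensus call fewer columns binding, which is the kind of precision/recall control the abstract describes; the toy values above are invented and do not reproduce the reported numbers.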

Highlights

The Class Architecture Topology Homology (CATH) database provides a hierarchical classification of protein domain structures including a sub-classification of superfamilies into functional families (FunFams)
FunFams classify all proteins automatically. They captured binding residue similarity about 20% better (1.2-fold increase; 1.1-fold on the same dataset) than the expert-curated Enzyme Commission (EC) numbers, compared at identity in all four digits, and about 20% better (1.2-fold increase) than PROSITE patterns or Pfam families
The high similarity of binding residues for proteins with the same EC number mostly originated from the same FunFam
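To make "binding residue similarity" within a group concrete, here is a small sketch. The text above does not give the formula used, so the overlap measure below (shared binding columns divided by all binding columns of a pair, averaged over all pairs) is an assumption, as are the function names and the toy data.

```python
from itertools import combinations


def pairwise_agreement(sites_a, sites_b):
    """Overlap of two sets of alignment columns annotated as binding.

    Assumed measure: intersection size divided by union size.
    """
    union = sites_a | sites_b
    if not union:
        return 0.0
    return len(sites_a & sites_b) / len(union)


def mean_group_agreement(group):
    """Average pairwise agreement over all protein pairs in a group.

    group: dict mapping a protein identifier to the set of alignment
    columns annotated as binding (hypothetical input layout).
    """
    pairs = list(combinations(group.values(), 2))
    if not pairs:
        return 0.0
    return sum(pairwise_agreement(a, b) for a, b in pairs) / len(pairs)


# Toy group of three aligned proteins; column indices are invented.
toy_group = {"p1": {1, 2, 3}, "p2": {2, 3, 4}, "p3": {3}}
toy_mean = mean_group_agreement(toy_group)
```

The same function could be applied to proteins grouped by FunFam, by EC number, or at random, which is the kind of comparison the highlights summarize.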

Summary

Introduction

The CATH database provides a hierarchical classification of protein domain structures, including a sub-classification of superfamilies into functional families (FunFams). Since the annotation of protein function, e.g. through GO or EC numbers, often precedes the experimental unravelling of molecular details, our molecular proxy effectively removed the circularity, thereby providing an independent means of assessing functional classifications. We added another element, namely results from two methods that predict binding residues exclusively from information available from the sequence (dubbed BindPredict-CCS and BindPredict-CC [15]). We expected to be able to leverage the FunFam clustering to filter binding residue predictions, as exemplified by the two methods tested (Fig. 1).

Journal: BMC Bioinformatics	Publication Date: Jul 18, 2019
Citations: 22	License type: open-access

Similar Papers

Prediction of protein-ATP binding residues using multi-view feature learning via contextual-based co-attention network
Jia-Shun Wu ... Dong-Jun Yu
Computers in Biology and Medicine | VOL. 172
04 Mar 2024

BindWeb: A web server for ligand binding residue and pocket prediction from protein structures.
Ying Xia ... Hong‐Bin Shen
Protein science : a publication of the Protein Society | VOL. 31
16 Nov 2022

Prediction of acid radical ion binding residues by K-nearest neighbors classifier
Liu Liu ... Shan Wang
BMC Molecular and Cell Biology | VOL. 20
01 Dec 2019

Structure Based Prediction of Binding Residues on DNA-binding Proteins
N Bhardwaj ... Guijun Zhao
01 Jan 2004
