Abstract

Estimating prediction quality is important because, without quality measures, it is difficult to determine how useful a prediction is. Currently, methods for ligand binding site residue prediction are assessed in the function prediction category of the biennial Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiment, using the Matthews Correlation Coefficient (MCC) and Binding-site Distance Test (BDT) metrics. However, assessing ligand binding site predictions with such metrics requires solved structures with bound ligands. We have therefore developed a ligand binding site quality assessment tool, FunFOLDQA, which uses protein feature analysis to predict ligand binding site quality before the protein structures and their ligand interactions have been solved experimentally. The FunFOLDQA feature scores were combined using three approaches: simple linear combinations, multiple linear regression and a neural network. The neural network produced significantly better correlations with both the MCC and BDT scores, according to Kendall’s τ, Spearman’s ρ and Pearson’s r correlation coefficients, when tested on both the CASP8 and CASP9 datasets. The neural network also produced the largest Area Under the Curve (AUC) score when Receiver Operating Characteristic (ROC) analysis was undertaken for the CASP8 dataset. Furthermore, the FunFOLDQA algorithm incorporating the neural network is shown to add value to FunFOLD when both methods are employed in combination. This results in a statistically significant improvement over all of the best server methods, the FunFOLD method (6.43%), and one of the top manual groups (FN293) tested on the CASP8 dataset. The FunFOLDQA method was also found to be competitive with the top server methods when tested on the CASP9 dataset.
To the best of our knowledge, FunFOLDQA is the first attempt to develop a method that can be used to assess ligand binding site prediction quality, in the absence of experimental data.
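
The abstract names MCC as one of the two assessment metrics but does not define it. As a minimal sketch, assuming the standard definition of the Matthews Correlation Coefficient applied per target to sets of residue numbers, the score for a single prediction could be computed as below. The function name `binding_site_mcc` and its parameters are hypothetical, chosen for illustration; the exact residue bookkeeping used by the CASP assessors may differ.

```python
from math import sqrt


def binding_site_mcc(predicted, observed, sequence_length):
    """Standard MCC for one binding-site residue prediction.

    predicted, observed: iterables of residue numbers (hypothetical input format).
    sequence_length: total number of residues in the target protein.
    """
    predicted, observed = set(predicted), set(observed)

    tp = len(predicted & observed)       # predicted and truly binding
    fp = len(predicted - observed)       # predicted but not binding
    fn = len(observed - predicted)       # binding but not predicted
    tn = sequence_length - tp - fp - fn  # everything else

    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Common convention (an assumption here): report 0 when the
        # score is undefined, e.g. for an empty prediction.
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Made-up residue numbers for a 100-residue target, for illustration only.
    print(binding_site_mcc({10, 11, 12, 45}, {10, 11, 12, 13}, 100))
```

MCC ranges from -1 to 1, with 1 for a perfect prediction and values near 0 for a prediction no better than chance. Computing it requires the observed binding residues, which is exactly the experimental information that a quality assessment method such as FunFOLDQA aims to do without.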

Highlights

  • Proteins are essential molecules in all living organisms and are involved in virtually all cellular processes, including transportation within and between cells, energy generation, catalysis, signalling, defence and maintaining the structural integrity of cells

  • The scores were based on several features we found to be important in determining a confident prediction from our development work for the FunFOLD algorithm and from our manual function prediction submissions for CASP9

  • The performance of FunFOLDQA is compared against that of groups that participated in the CASP8 and CASP9 experiments

Introduction

Proteins are essential molecules in all living organisms and are involved in virtually all cellular processes, including transportation within and between cells, energy generation, catalysis, signalling, defence and maintaining the structural integrity of cells. The development of numerous protein ligand binding site prediction methods has been driven by the recent inclusion of the function prediction category in CASP [6]. Ligand binding site prediction methods are subdivided into two broad groupings: sequence-based methods and structure-based methods [7]. Sequence-based methods exploit the sequence conservation of structurally or functionally important residues; they include firestar (CASP9 – group FN315) [8,9], WSsas [10], FRcons [11], ConFunc (CASP8 – FN437) [12], ConSurf [13], FPSDP (CASP8 – FN242) [14], INTREPID [15] and ss-TEA [16]. Structure-based methods can be further separated into geometric methods (FINDSITE [17] and Surflex-PSIM [18]), energetic methods (SITEHOUND [19]) and miscellaneous methods, which utilize knowledge from homology modelling (FunFOLD – CASP9 FN425 [4], 3DLigandSite – CASP9 FN017, FN057, FN072 and FN415 [20] and I-TASSER_FUNCTION – CASP9 FN339 [21]), surface accessibility (LIGSITEcsc [22]) and physicochemical properties (SCREEN [23]).
