Abstract

Background

Identifying ligand-binding sites is a key step in annotating protein functions and in finding applications in drug design. Many current sequence-based methods combine predicted results from other classifiers, such as predicted secondary structure, predicted solvent accessibility and predicted disorder probabilities, with the position-specific scoring matrix (PSSM) as input for binding-site prediction. These predicted features not only readily produce a high-dimensional feature space but also greatly increase the complexity of the algorithms. Moreover, the performance of these predictors is largely influenced by those other classifiers.

Results

To verify that conservation is the most powerful attribute for identifying ligand-binding sites, and to show the importance of revising the PSSM to match the detailed conservation pattern of functional sites, we analyzed adenosine-5'-triphosphate (ATP) as an example ligand and proposed a simple method for ATP-binding site prediction, named CLCLpred (Contextual Local evolutionary Conservation-based method for Ligand-binding prediction). Our method uses no predicted results from other classifiers as input; all features are extracted from the PSSM alone. We tested our method on two separate data sets. Compared with nine existing methods on the same data sets, our method achieved the best performance.

Conclusions

This study demonstrates that: 1) exploiting the signal from the detailed conservation pattern of residues largely facilitates the prediction of protein functional sites; and 2) local evolutionary conservation enables accurate prediction of ATP-binding sites directly from protein sequence.

Highlights

  • Identifying ligand-binding sites is a key step in annotating protein functions and in finding applications in drug design

  • Our method is based on the assumptions that: 1) the most effective features for predicting functional sites are embedded in the sequence itself; 2) the local evolutionary conservation is distinct enough to enable an accurate prediction of ATP-binding sites directly from amino acid sequence, without requiring any additional predicted structural information

  • Only the high local evolutionary conservation scores in position-specific scoring matrix (PSSM) are considered as input, without employing any predicted features from other classifiers
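
The idea of using only high PSSM scores from a residue's local sequence context can be illustrated with a minimal sketch. This is not the CLCLpred implementation: the function name `window_features`, the window size, the score threshold, and the zero padding at the sequence ends are all assumptions made for illustration, since this page does not specify them.

```python
from typing import List


def window_features(
    pssm: List[List[int]],
    window: int = 7,      # assumed window size; not given in the source
    threshold: int = 0,   # assumed cutoff for a "high" conservation score
) -> List[List[int]]:
    """Build one feature vector per residue from a PSSM (hypothetical sketch).

    For each residue, the PSSM rows inside a centered sliding window are
    concatenated. Scores at or below `threshold` are set to 0, so only
    high conservation scores contribute. Positions that fall outside the
    sequence are padded with zeros.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")

    half = window // 2
    n_cols = len(pssm[0]) if pssm else 0
    pad_row = [0] * n_cols

    features = []
    for i in range(len(pssm)):
        vector = []
        for j in range(i - half, i + half + 1):
            row = pssm[j] if 0 <= j < len(pssm) else pad_row
            # Keep only scores above the threshold; zero out the rest.
            vector.extend(score if score > threshold else 0 for score in row)
        features.append(vector)
    return features


# Toy PSSM with 3 residues and 2 columns (a real PSSM has 20 columns).
toy_pssm = [[5, -2], [-1, 3], [0, 4]]
toy_features = window_features(toy_pssm, window=3)
```

Each resulting vector has `window * 20` entries for a real PSSM and could be passed to any standard classifier; which classifier the authors used is not stated here.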

Summary

Introduction

Identifying ligand-binding sites is a key step in annotating protein functions and in finding applications in drug design. Many sequence-based methods combine predicted results from other classifiers, such as predicted secondary structure, predicted solvent accessibility and predicted disorder probabilities, with the position-specific scoring matrix (PSSM) as input for binding-site prediction. These predicted features produce a high-dimensional feature space and greatly increase the complexity of the algorithms. A simpler and more efficient method for identifying interacting residues is therefore needed.

