Abstract
Motivation: Protein-binding site prediction lays a foundation for the functional annotation of proteins and for structure-based drug design. As the number of available protein structures increases, structural alignment based algorithms have become the dominant approach to protein-binding site prediction. However, current algorithms underutilize the ever increasing number of three-dimensional protein–ligand complex structures (bound proteins), and they could be improved in the alignment process, the selection of templates and the clustering of templates. Herein, we built the largest database of bound templates to date, with stringent quality control. On this basis, bSiteFinder was developed as a protein-binding site prediction server.
Results: By introducing Homology Indexing, Chain Length Indexing, Stability of Complex and Optimized Multiple-Templates Clustering into our algorithm, the efficiency of our server has been significantly improved. Further, the accuracy was approximately 2–10 % higher than that of other algorithms in tests with either the bound dataset or the unbound dataset. On the 210 bound dataset, bSiteFinder achieved an accuracy of up to 94.8 % (MCC 0.95). On the 48 bound/unbound dataset, bSiteFinder achieved accuracies of up to 93.8 % for bound proteins (MCC 0.95) and 85.4 % for unbound proteins (MCC 0.72). The bSiteFinder server is freely available at http://binfo.shmtu.edu.cn/bsitefinder/, and the source code is provided on the methods page.
Conclusion: An online bSiteFinder server is freely available at http://binfo.shmtu.edu.cn/bsitefinder/. Our work lays a foundation for the functional annotation of proteins and for structure-based drug design. With ever increasing numbers of three-dimensional protein–ligand complex structures, our server should become more accurate and less time-consuming.
Graphical abstract: bSiteFinder (http://binfo.shmtu.edu.cn/bsitefinder/) was developed as a protein-binding site prediction server based on the largest database of bound templates to date, with stringent quality control. By introducing Homology Indexing, Chain Length Indexing, Stability of Complex and Optimized Multiple-Templates Clustering into our algorithm, the efficiency of our server has been significantly improved. Moreover, the accuracy was approximately 2–10 % higher than that of other algorithms in tests with either the bound dataset or the unbound dataset.
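The MCC values quoted above presumably refer to the Matthews correlation coefficient, a standard measure for binary predictions. The sketch below shows only the textbook formula. How true and false positives are counted for binding sites is not stated in this section, so the function name `mcc` and the example counts are illustrative assumptions, not figures from the paper.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0

# Hypothetical counts, chosen only to illustrate the calculation.
example = mcc(tp=45, tn=45, fp=5, fn=5)
```

Unlike plain accuracy, MCC ranges from -1 to 1 and stays informative when the positive and negative classes differ greatly in size.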
Highlights
Most biological processes involve the interaction of ligands with proteins.
Performance of our algorithm and its comparison with others: two widely adopted datasets, the 210 bound dataset and the 48 bound/unbound dataset [29], were used to test our algorithm, and the results are shown in Tables 1 and 2.
Although Homology Indexing has significantly increased the efficiency of binding-site prediction for unbound chains, some chains still have no satisfactory homologous template structures, such as PDB ID: 4h12, CHAIN ID: A. For such protein chains, we further introduce Chain Length Indexing to reduce the number of time-consuming structural alignments.
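The idea of using chain length as a cheap pre-filter before structural alignment can be sketched as follows. This is a minimal illustration and not the server's actual implementation: the tolerance value, the function names and the template identifiers are all hypothetical, and this section does not specify the selection criterion that bSiteFinder uses.

```python
from bisect import bisect_left, bisect_right

def build_length_index(templates):
    """Sort (chain_length, template_id) pairs so length ranges can be bisected."""
    return sorted((length, tid) for tid, length in templates.items())

def candidates_by_length(index, query_length, tolerance=0.2):
    """Return template ids whose chain length lies within +/- tolerance of the query.

    The 20 % default tolerance is an assumed value for illustration only.
    """
    low = query_length * (1 - tolerance)
    high = query_length * (1 + tolerance)
    lengths = [length for length, _ in index]
    start = bisect_left(lengths, low)
    stop = bisect_right(lengths, high)
    return [tid for _, tid in index[start:stop]]

# Hypothetical template chains and their lengths in residues.
templates = {"tmplA": 95, "tmplB": 240, "tmplC": 262, "tmplD": 610}
index = build_length_index(templates)

# Only templates of comparable length go on to structural alignment.
selected = candidates_by_length(index, query_length=250)
```

With a query chain of 250 residues, only `tmplB` and `tmplC` would be passed on, so two structural alignments are run instead of four.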
Summary
Most biological processes involve the interaction of ligands with proteins. Functional characterization of the ligand-binding sites of proteins is a key issue in understanding those biological processes [1–4]. Identifying the location of protein-binding sites is a vital first step in structure-based drug design [5–8]. To date, a variety of computational methods have been developed for protein-binding site prediction, which can be divided into four categories: geometry based methods [9–14], energy based methods [15, 16], alignment based methods [17–20] and other miscellaneous methods [21–23]. Alignment based methods can be further divided into sequence alignment based and structural alignment based methods. Structural alignment based methods have outperformed the other methods because of their greater efficiency and accuracy.
Gao et al. J Cheminform (2016) 8:38