Abstract

Predicting interactions between target proteins and potential leads is of great benefit in the drug discovery process. We present a comprehensively applicable statistical prediction method for interactions between any proteins and chemical compounds. The method requires only protein sequence data and chemical structure data and uses support vector machines (SVMs) as the statistical learning method. Because comprehensive predictions can involve many false positives, we propose two approaches to reduce them: (i) efficient use of multiple statistical prediction models in the framework of a two-layer SVM and (ii) reasonable design of the negative data used to construct statistical prediction models. In the two-layer SVM, outputs produced by the first-layer SVM models, which are constructed with different negative samples and reflect different aspects of classification, are used as inputs to the second-layer SVM. To design negative data that produce fewer false positive predictions, we iteratively construct SVM models, or classification boundaries, from positive and tentative negative samples and select additional negative sample candidates according to pre-determined rules. Moreover, to fully exploit the advantages of statistical learning methods, we propose a strategy to effectively feed experimental results back into computational predictions, taking into account the biological effects of interest. We show the usefulness of our approach by predicting potential ligands binding to human androgen receptors from more than 19 million chemical compounds and verifying these predictions by in vitro binding. We then use this experimental validation as feedback to enhance subsequent computational predictions, and experimentally validate those predictions again. Iterating in silico prediction and in vitro or in vivo experimental verification with sufficient feedback enabled us to identify novel ligand candidates that were distant from known ligands in chemical space.
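
The two-layer idea described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes toy 2D points in place of protein sequence and chemical structure features, and a linear SVM trained by stochastic gradient descent on the hinge loss in place of a full SVM library. All names (`train_linear_svm`, `first_layer`, `second_layer`, `predict`) are hypothetical.

```python
import random

def decision(model, x):
    """Decision value w.x + b of a linear SVM given as (w, b)."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lr=0.01, lam=0.01, epochs=200, seed=0):
    """Fit a linear SVM (labels +1/-1) by SGD on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * decision((w, b), X[i]) < 1  # sample inside the margin?
            w = [wj * (1 - lr * lam) for wj in w]          # L2 shrinkage step
            if violated:                                   # hinge-loss subgradient step
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def cluster(center, n, rng, sd=0.3):
    """Draw n noisy 2D points around a center (toy stand-in for real features)."""
    return [[rng.gauss(center[0], sd), rng.gauss(center[1], sd)] for _ in range(n)]

rng = random.Random(0)
positives = cluster((2, 2), 20, rng)
negatives_a = cluster((-2, 2), 20, rng)   # assumed negative set 1
negatives_b = cluster((2, -2), 20, rng)   # assumed negative set 2

# First layer: one SVM per negative set, each trained against the same positives.
first_layer = [
    train_linear_svm(positives + neg, [1] * len(positives) + [-1] * len(neg))
    for neg in (negatives_a, negatives_b)
]

def first_layer_outputs(x):
    """Decision values of all first-layer models, used as second-layer features."""
    return [decision(model, x) for model in first_layer]

# Second layer: an SVM trained on the first-layer outputs for all samples.
all_X = positives + negatives_a + negatives_b
all_y = [1] * len(positives) + [-1] * (len(negatives_a) + len(negatives_b))
second_layer = train_linear_svm([first_layer_outputs(x) for x in all_X], all_y)

def predict(x):
    """Final two-layer prediction: +1 (interacting) or -1 (non-interacting)."""
    return 1 if decision(second_layer, first_layer_outputs(x)) > 0 else -1
```

In this toy setting each first-layer model has only seen one kind of negative, so on its own it tends to score the other kind as positive. The second layer combines both decision values, which is how the sketch illustrates the reduction of false positives.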

Highlights

  • In the early stages of the drug discovery process, prediction of the binding of a chemical compound to a specific protein can be of great benefit in the identification of lead compounds

  • As our method can be utilized for virtual screening that seeks lead compounds in drug discovery, we showed its usefulness by applying it to the comprehensive prediction of ligands binding to human androgen receptors, with in vitro experimental verification of its predictions

  • In contrast to most previous virtual screening studies, which predict chemical compounds of interest mainly with 3D structure-based methods and then verify them experimentally, we proposed a strategy to effectively feed experimental results back into subsequent predictions and applied it to a second round of predictions followed by a second experimental verification


Summary

Introduction

In the early stages of the drug discovery process, prediction of the binding of a chemical compound to a specific protein can be of great benefit in the identification of lead compounds (candidates for a new drug). Effective screening of potential drug candidates at an early stage generates large cost savings at later stages of the overall drug discovery process. Fast and accurate statistical prediction methods for the binding affinity of any protein-ligand pair have been proposed for cases where information on 3D structures, binding pockets and binding affinities (e.g. pKi) is available for a sufficient number of protein-compound pairs [5]. Among the GPCRs (G-protein coupled receptors), whose modulation underlies the actions of 30% of the best-known commercial drugs [7], the full structure of only a few mammalian members, including bovine rhodopsin [8] and human beta 2 adrenoreceptor [9], is known.

