Abstract

In this study, we developed a novel algorithm that improves the screening performance of an arbitrary docking scoring function by recalibrating the docking score of a query compound based on its structural similarity to a set of training compounds, at negligible extra computational cost. Two popular docking methods, Glide and AutoDock Vina, were adopted as the original scoring functions to be processed with the new algorithm, and similar improvements were achieved for both. Predicted binding affinities were compared against experimental data from the ChEMBL and DUD-E databases. Eleven representative drug receptors from diverse drug target categories were used to evaluate the hybrid scoring function. The effects of four different fingerprints (FP2, FP3, FP4, and MACCS) and four different compound similarity effect (CSE) functions were explored. Encouragingly, the screening performance was significantly improved for all 11 drug targets, especially when CSE = S⁴ (where S is the Tanimoto structural similarity) and the FP2 fingerprint were applied. The average predictive index (PI) values increased from 0.34 to 0.66 and from 0.39 to 0.71 for the Glide and AutoDock Vina scoring functions, respectively. To evaluate the performance of the calibration algorithm in drug lead identification, we also imposed an upper limit on the structural similarity. This mimics the real scenario of screening diverse libraries, in which query ligands are general-purpose screening compounds that are not necessarily structurally similar to the reference ligands. Even under this constraint, the hybrid scoring function still outperformed the original docking scoring function. The hybrid scoring function was further evaluated using external datasets for two systems, and the PI values increased from 0.24 to 0.46 and from 0.14 to 0.42 for the A2AR and CFX systems, respectively.
In conclusion, our calibration algorithm can significantly improve virtual screening performance in both the drug lead optimization and identification phases at negligible computational cost.
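
The abstract's core idea can be illustrated with a minimal sketch: compute the Tanimoto similarity S between binary fingerprints, turn it into a compound similarity effect CSE = S⁴, and use that weight to adjust a docking score. The correction rule below (shifting the query's score by the docking error of its most similar training compound, scaled by the CSE) is only an assumption for illustration; the abstract does not give the authors' actual formula. The function names, the representation of fingerprints as sets of on-bit indices, and the toy numbers are likewise hypothetical.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def cse(similarity: float, power: int = 4) -> float:
    """Compound similarity effect; power=4 gives CSE = S^4."""
    return similarity ** power


def recalibrate(dock_score: float, query_fp: set, training: list) -> float:
    """Hypothetical recalibration rule (an assumption, not the paper's formula).

    training holds (fingerprint, docking_score, experimental_affinity) tuples.
    The query's docking score is shifted by the docking error of the most
    similar training compound, scaled by the CSE weight.
    """
    if not training:
        return dock_score
    best_fp, best_dock, best_exp = max(
        training, key=lambda t: tanimoto(query_fp, t[0])
    )
    weight = cse(tanimoto(query_fp, best_fp))
    return dock_score + weight * (best_exp - best_dock)


# Toy training set: fingerprint, docking score, experimental affinity (kcal/mol).
training_set = [
    ({1, 2, 3}, -7.0, -9.0),
    ({7, 8}, -5.0, -5.5),
]
identical = recalibrate(-8.0, {1, 2, 3}, training_set)  # S = 1.0, full correction
partial = recalibrate(-8.0, {2, 3, 4}, training_set)    # S = 0.5, damped correction
```

Raising S to the fourth power sharply damps the contribution of weakly similar compounds: at S = 0.5 the weight is only 0.0625, so a dissimilar query keeps almost all of its original docking score.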

Highlights

  • To save time and cost in drug discovery projects, various in silico approaches have been developed and applied to reduce the number of compounds that must be experimentally synthesized and tested

  • First, how can we use the information on known structures and activities to improve screening performance? Second, can we combine docking and scoring methods with the extremely fast methods of similarity calculation to improve the accuracy of binding affinity estimation? If so, how can we incorporate the two types of scores into one hybrid scoring function? In this work, we attempted to develop a novel algorithm that makes good use of this valuable information on known bioactive compounds

  • For all 11 systems, the scoring power (measured by RMSE and MAE) and the ranking power (measured by R² and the predictive index, PI) of the original docking scoring function and of the hybrid scoring functions using different fingerprints and compound similarity effect (CSE) functions are shown in Additional file 1: Tables S3, S4 and Fig. 2
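
For readers unfamiliar with these metrics, the sketch below shows how RMSE, MAE, and a predictive index could be computed from predicted and experimental affinities. The PI follows the commonly used pairwise definition (each pair of compounds is weighted by the difference in experimental values and scored +1 if ranked correctly, -1 if ranked incorrectly, 0 if the predictions tie); the text here does not spell out the formula, so treating this as the authors' exact definition is an assumption. Function names are illustrative.

```python
from itertools import combinations
from math import sqrt


def rmse(pred, exp):
    """Root-mean-square error between predicted and experimental values."""
    return sqrt(sum((p - e) ** 2 for p, e in zip(pred, exp)) / len(exp))


def mae(pred, exp):
    """Mean absolute error between predicted and experimental values."""
    return sum(abs(p - e) for p, e in zip(pred, exp)) / len(exp)


def predictive_index(pred, exp):
    """Pairwise, weighted ranking score in [-1, 1] (assumed PI definition)."""
    numerator = 0.0
    denominator = 0.0
    for i, j in combinations(range(len(exp)), 2):
        d_exp = exp[j] - exp[i]
        d_pred = pred[j] - pred[i]
        weight = abs(d_exp)  # larger experimental gaps count more
        if d_exp == 0 or d_pred == 0:
            agreement = 0    # ties carry no ranking information
        elif d_exp * d_pred > 0:
            agreement = 1    # pair ranked in the correct order
        else:
            agreement = -1   # pair ranked in the wrong order
        numerator += weight * agreement
        denominator += weight
    return numerator / denominator if denominator else 0.0
```

A PI of 1 means every pair is ranked correctly, 0 corresponds to random ranking, and -1 means every pair is ranked in reverse.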

Summary

Introduction

To save time and cost in drug discovery projects, various in silico approaches have been developed and applied to reduce the number of compounds that must be experimentally synthesized and tested. Similarity search is a typical ligand-based virtual screening (LBVS) method, which predicts the activity of query compounds from their similarity or dissimilarity to known reference ligands using numerical similarity descriptors (fingerprints) [3]. Both docking and similarity methods have been successfully applied, independently or hierarchically, to filter out compounds that are confidently predicted to be inactive against specific receptors of interest. The accuracy of docking methods is limited because they do not adequately model the structural flexibility of target receptors, solvation effects, entropy changes, and so on. These limitations of docking and scoring methods may be overcome by more accurate methodologies, such as end-point methods (MM-PBSA, MM-GBSA, LIE, etc.) [6, 7] or rigorous alchemical free energy methods (FEP, TI, etc.) [8, 9], at the price of much higher computational cost and much longer run times. First, how can we use the information on known structures and activities to improve screening performance? Second, can we combine docking and scoring methods with the extremely fast methods of similarity calculation to improve the accuracy of binding affinity estimation? If so, how can we incorporate the two types of scores into one hybrid scoring function? In this work, we attempted to develop a novel algorithm that makes good use of this valuable information on known bioactive compounds.
