Abstract

Scoring functions are essential for modern in silico drug discovery. However, accurately predicting binding affinity with scoring functions remains a challenging task, and their performance is very heterogeneous across different target classes. Scoring functions based on precise physics-based descriptors that better represent the protein–ligand recognition process are strongly needed. We developed a set of new empirical scoring functions, named DockTScore, by explicitly accounting for physics-based terms combined with machine learning. Target-specific scoring functions were developed for two important drug targets, proteases and protein–protein interactions, representing an original class of molecules for drug discovery. Multiple linear regression (MLR), support vector machine and random forest algorithms were employed to derive general and target-specific scoring functions involving optimized MMFF94S force-field terms, solvation and lipophilic interaction terms, and an improved term accounting for the ligand torsional entropy contribution to ligand binding. The DockTScore scoring functions proved competitive with the current best-evaluated scoring functions in terms of binding energy prediction and ranking on four DUD-E datasets, and will be useful for in silico drug design for diverse proteins as well as for specific targets such as proteases and protein–protein interactions. Currently, the MLR DockTScore is available at www.dockthor.lncc.br.

Highlights

  • Scoring functions are essential for modern in silico drug discovery

  • We developed a set of new empirical scoring functions, named DockTScore, to estimate protein–ligand binding affinity by explicitly accounting for physics-based interaction terms contributing to the binding free energy

  • It is interesting to note that the best correlation was obtained with the softened version EvdWS, which is expected because no energy minimization of the complex structures was performed


Summary

Introduction

Scoring functions are essential for modern in silico drug discovery. The accurate prediction of binding affinity by scoring functions remains a challenging task. Multiple linear regression (MLR), support vector machine and random forest algorithms were employed to derive general and target-specific scoring functions involving optimized MMFF94S force-field terms, solvation and lipophilic interaction terms, and an improved term accounting for the ligand torsional entropy contribution to ligand binding. We developed a set of new empirical scoring functions, named DockTScore, to estimate protein–ligand binding affinity by explicitly accounting for physics-based interaction terms contributing to the binding free energy. The DockTScore scoring functions incorporate classical van der Waals and electrostatic energy terms, optimized terms accounting for solvation and lipophilic protein–ligand interactions, and an improved estimation of the ligand torsional entropy contribution to ligand binding for better describing [...]. Developing scoring functions from physics-based descriptors that represent the protein–ligand recognition process, together with assessing the accuracy of different linear and nonlinear models, is important to avoid unrealistic overestimation of scoring function accuracy due to some known biases, especially when training nonlinear models [4,6,34,35].
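
To make the linear (MLR) variant concrete, the sketch below fits an empirical scoring function of the form score = w0 + Σ wi · term_i by ordinary least squares. It is an illustration only and not the DockTScore implementation: the term names mirror the descriptors listed above, while the coefficient values, the synthetic noise-free data, and the helper names (`solve_linear_system`, `fit_mlr`, `score`) are assumptions introduced for this example.

```python
import random

# Descriptor names follow the terms mentioned in the text; the numeric
# values used below are synthetic and purely illustrative.
TERMS = ["vdw", "electrostatic", "solvation", "lipophilic", "torsional_entropy"]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest pivot into place for stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mlr(features, targets):
    """Ordinary least squares via the normal equations.

    Returns [intercept, w_1, ..., w_k].
    """
    rows = [[1.0] + list(r) for r in features]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def score(coeffs, terms):
    """Linear scoring function: intercept plus weighted sum of terms."""
    return coeffs[0] + sum(w * t for w, t in zip(coeffs[1:], terms))


# Hypothetical "true" coefficients used only to generate synthetic data.
true_coeffs = [-1.0, 0.1, 0.05, 0.2, -0.3, 0.4]
rng = random.Random(0)
X = [[rng.uniform(-10.0, 10.0) for _ in TERMS] for _ in range(50)]
y = [score(true_coeffs, row) for row in X]  # noise-free synthetic affinities

fitted = fit_mlr(X, y)
for name, w in zip(["intercept"] + TERMS, fitted):
    print(f"{name:>18s}: {w:+.4f}")
```

Because the synthetic targets are generated without noise, the fit should recover the generating coefficients up to floating-point error. With real affinity data the coefficients would have to be estimated and validated on held-out complexes, which is where the comparison between linear and nonlinear models mentioned above becomes relevant.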

