Abstract

Virtual screening—predicting which compounds within a specified compound library bind to a target molecule, typically a protein—is a fundamental task in drug discovery. Doing virtual screening well provides tangible practical benefits, including reduced drug development costs, faster time to therapeutic viability, and fewer unforeseen side effects. As with most applied computational tasks, the algorithms currently used to perform virtual screening feature inherent tradeoffs between speed and accuracy. Furthermore, even theoretically rigorous, computationally intensive methods may fail to account for effects that determine whether a given compound will ultimately be usable as a drug. Here we investigate the virtual screening performance of the recently released Gnina molecular docking software, which uses deep convolutional networks to score protein-ligand structures. We find that, on average, Gnina outperforms conventional empirical scoring: its default scoring function outperforms the empirical AutoDock Vina scoring function on 89 of the 117 targets of the DUD-E and LIT-PCBA virtual screening benchmarks, with a median 1% early enrichment factor more than twice that of Vina. However, we also find that issues of bias linger in these benchmark sets, even when they are not used directly to train models, and this bias obscures the extent to which machine learning models achieve their performance through a sophisticated interpretation of molecular interactions rather than by fitting to simplistic, uninformative property distributions.
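For reference, the 1% early enrichment factor (EF1%) cited above is the hit rate among the top-scoring 1% of the ranked library divided by the hit rate of the library as a whole, so a value of 1 corresponds to random ranking. The sketch below computes it with NumPy; the function name and the toy data are illustrative and are not taken from the Gnina codebase.

```python
import numpy as np

def enrichment_factor(scores, labels, fraction=0.01):
    """Early enrichment factor: fraction of actives recovered in the
    top `fraction` of the ranked library, divided by the fraction of
    actives in the whole library (the random expectation)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n = len(scores)
    n_top = max(1, int(round(fraction * n)))
    # Rank compounds from best (highest score) to worst.
    order = np.argsort(-scores)
    top_actives = labels[order[:n_top]].sum()
    hit_rate_top = top_actives / n_top
    hit_rate_all = labels.sum() / n
    return hit_rate_top / hit_rate_all

# Toy example: 1000 compounds, 50 actives, actives score slightly higher.
rng = np.random.default_rng(0)
labels = np.zeros(1000, dtype=bool)
labels[:50] = True
scores = rng.normal(size=1000) + labels
print(enrichment_factor(scores, labels, fraction=0.01))
```

With 1% of a 1000-compound library, the top 10 compounds are examined, so the maximum attainable EF1% here is 20 (all 10 being actives out of a 5% base rate); in general the ceiling is 1/fraction or lower when actives are scarce.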

Highlights

  • Virtual screening poses this problem: given a target molecule and a set of compounds, rank the compounds so that all those that are active against the target are ranked ahead of those that are inactive

  • An in vitro screen is the source of ground truth for this binding classification problem, but there are at least four significant limitations associated with such screening: time and cost limit the number of screens that can be run; only compounds that physically exist can be screened this way; the screening process is not always accurate; and in vitro activity against a given target is necessary but not sufficient for identifying useful drugs

  • Force fields rely on physics-based terms mostly representing electrostatic interactions; empirical scoring functions may include counts of specific features as well as physics-inspired pairwise potentials; and knowledge-based statistical potentials calculate close contacts between molecules in structural databases and fit potentials biased toward structures that resemble these reference data (a toy sketch of an empirical-style term follows this list)
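To make the "counts of specific features plus physics-inspired pairwise potentials" description above concrete, here is a deliberately simplified, invented scoring term. The functional forms, cutoffs, and weights are placeholders for exposition and do not correspond to Vina's or Gnina's actual scoring functions.

```python
import numpy as np

def toy_empirical_score(lig_xyz, rec_xyz, lig_is_polar, rec_is_polar,
                        weights=(-0.5, 1.5, -0.7)):
    """Toy empirical-style score (lower is better, as in docking):
    a Gaussian steric-contact term, a quadratic penalty for close
    clashes, and a count of polar-polar contacts standing in for
    hydrogen bonds. All cutoffs and weights are arbitrary."""
    w_steric, w_clash, w_hbond = weights
    # Pairwise ligand-receptor atom distances, shape (n_lig, n_rec).
    d = np.linalg.norm(lig_xyz[:, None, :] - rec_xyz[None, :, :], axis=-1)
    steric = np.exp(-((d - 3.5) / 1.0) ** 2).sum()   # favourable contacts near 3.5 Å
    clash = ((2.5 - d).clip(min=0.0) ** 2).sum()     # penalise atoms closer than 2.5 Å
    polar_pair = np.outer(lig_is_polar, rec_is_polar)
    hbond = ((d < 3.2) & polar_pair).sum()           # crude hydrogen-bond count
    return w_steric * steric + w_clash * clash + w_hbond * hbond

# Toy usage with random coordinates (10 ligand atoms, 50 receptor atoms).
rng = np.random.default_rng(1)
print(toy_empirical_score(rng.normal(scale=5, size=(10, 3)),
                          rng.normal(scale=10, size=(50, 3)),
                          rng.random(10) < 0.3,
                          rng.random(50) < 0.3))
```

Real empirical functions such as Vina's use more carefully parameterised terms fit to experimental affinities, but they share this additive, feature-counting structure.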

Introduction

Virtual screening poses this problem: given a target molecule and a set of compounds, rank the compounds so that all those that are active against the target are ranked ahead of those that are inactive. Virtual screening methods can be broadly classified as ligand-based or structure-based. Ligand-based methods rely on information about known active compounds and base their predictions on the similarity between compounds in the screening database and these known actives. Structure-based methods instead use a three-dimensional structure of the target and evaluate candidate protein-ligand poses with a scoring function, which is typically a force field, an empirical function, a knowledge-based statistical potential, or a machine learning (ML) model. Force fields rely on physics-based terms mostly representing electrostatic interactions; empirical scoring functions may include counts of specific features as well as physics-inspired pairwise potentials; and knowledge-based statistical potentials calculate close contacts between molecules in structural databases and fit potentials biased toward structures that resemble these reference data. Modern ML scoring functions tend to impose fewer restrictions on the final functional form and attempt to learn the relevant features from the data and prediction task itself (for example, they may consist of a neural network that processes the structural input directly).
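As a minimal illustration of the ligand-based approach described above, the sketch below ranks a small library by maximum Tanimoto similarity of Morgan fingerprints to a set of known actives using RDKit. This is a generic example, not part of Gnina; the SMILES strings are arbitrary placeholders and the helper names are our own.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def morgan_fp(smiles, radius=2, n_bits=2048):
    """Morgan (ECFP-like) bit-vector fingerprint for a SMILES string."""
    mol = Chem.MolFromSmiles(smiles)
    return AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)

# Placeholder SMILES for known actives and the library to be screened.
known_actives = ["CC(=O)Oc1ccccc1C(=O)O", "CN1CCC[C@H]1c1cccnc1"]
library = ["c1ccccc1O", "CC(=O)Nc1ccc(O)cc1", "CCN(CC)CCOC(=O)c1ccccc1N"]

active_fps = [morgan_fp(s) for s in known_actives]

def similarity_score(smiles):
    """Score a library compound by its best Tanimoto similarity to any
    known active (higher = more likely to be ranked as active)."""
    fp = morgan_fp(smiles)
    return max(DataStructs.TanimotoSimilarity(fp, a) for a in active_fps)

for smi in sorted(library, key=similarity_score, reverse=True):
    print(f"{similarity_score(smi):.2f}  {smi}")
```

A structure-based method would instead dock each library compound into the target's binding site and rank by the resulting score, which is the setting in which Gnina's convolutional neural network scoring is evaluated here.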
