Quantum–mechanical property prediction of solvated drug molecules: what have we learned from a decade of SAMPL blind prediction challenges?

Nicolas Tielker, K Friedemann Schmidt, Gerhard Hessler, Stefan Güssregen, Stefan M Kast, Lukas Eberlein

doi:10.1007/s10822-020-00347-5

Abstract

Joint academic–industrial projects supporting drug discovery are frequently pursued to deploy and benchmark cutting-edge methodological developments from academia in a real-world industrial environment at different scales. The tasks range from small-molecule physicochemical property assessment through protein–ligand interaction up to statistical analyses of biological data. In this way, method development and usability both benefit from insights gained at both ends: the predictiveness and readiness of novel approaches are confirmed, while pharmaceutical drug makers get early access to novel tools for the quality of drug products and the benefit of patients. Quantum–mechanical and simulation methods in particular fall into this group, as they require skill and expense in their development as well as significant resources in their application, and are therefore comparatively slow to trickle into industrial use. Nevertheless, these physics-based methods are becoming more and more useful. Starting with a general overview of these methods, and of quantum–mechanical methods for drug discovery in particular, we review a decade-long and ongoing collaboration between Sanofi and the Kast group focused on the application of the embedded cluster reference interaction site model (EC-RISM), a solvation model for quantum chemistry, to study small-molecule chemistry in the context of joint participation in several SAMPL (Statistical Assessment of Modeling of Proteins and Ligands) blind prediction challenges. Starting with an early application to tautomer equilibria in water (SAMPL2), the methodology was developed further over the years to allow for challenge contributions on predictions of distribution coefficients (SAMPL5) and acidity constants (SAMPL6). Particular emphasis is put on a frequently overlooked aspect of measuring the quality of models, namely the retrospective analysis of earlier datasets and predictions in light of more recent and advanced developments.
We therefore demonstrate the performance of the current methodological state of the art, as developed and optimized for the SAMPL6 pKa and octanol–water log P challenges, when re-applied to the earlier SAMPL5 cyclohexane–water log D and SAMPL2 tautomer equilibria datasets. Systematic improvement is not found consistently, despite the similarity of the problem class, i.e., protonation reactions and phase distribution. Hence, it is possible to learn about hidden bias in model assessment, as results derived from more elaborate methods do not necessarily improve quantitative agreement. On the one hand, this indicates the role of chance or coincidence in model development, which allows for the identification of systematic error and opportunities for improvement; on the other, it reveals possible sources of experimental uncertainty. These insights are particularly useful for further academia–industry collaborations, as both partners are then able to optimize both the computational and experimental settings for data generation.
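
The two problem classes named in the abstract, protonation reactions and phase distribution, meet in the distribution coefficient. As a minimal sketch (a textbook relation, not the EC-RISM workflow reviewed in the paper), the following assumes a monoprotic solute whose neutral form alone enters the organic phase. The function names and the example values are hypothetical.

```python
import math


def log_d_monoprotic_acid(log_p: float, pka: float, ph: float) -> float:
    """log D of a monoprotic acid HA at a given pH.

    Assumption: only neutral HA partitions into the organic phase,
    so D = P / (1 + 10**(pH - pKa)).
    """
    return log_p - math.log10(1.0 + 10.0 ** (ph - pka))


def log_d_monoprotic_base(log_p: float, pka: float, ph: float) -> float:
    """log D of a monoprotic base B at a given pH.

    Here pka is the pKa of the conjugate acid BH+, and only neutral B
    is assumed to partition, so D = P / (1 + 10**(pKa - pH)).
    """
    return log_p - math.log10(1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # Hypothetical acid: log P = 2.0, pKa = 4.0, aqueous phase at pH 7.4.
    print(log_d_monoprotic_acid(log_p=2.0, pka=4.0, ph=7.4))
```

Under this assumption log D never exceeds log P, and at pH = pKa the two differ by log10(2). An error in a predicted pKa therefore propagates directly into log D whenever the solute is substantially ionized.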

Highlights

Physics-based modeling in drug discovery: Drug discovery is a multidimensional optimization journey, starting from early hit molecules with multiple liabilities and leading to a clinical candidate with the desired pharmacokinetic/pharmacodynamic (PK/PD) and safety profile, which requires the parallel monitoring of different properties.
Protein–ligand binding pose and affinity predictions were spun off from SAMPL in the form of “Grand Challenges” organized by the Drug Design Data Resource (D3R) [41], while SAMPL5 was devoted to host–guest binding on the one hand [42] and, as an extraordinarily more complicated problem compared to earlier small-molecule SAMPL challenges, distribution coefficients between water and cyclohexane on the other [43].
While the SAMPL5 post-submission pKa model was still based on self-consistent atomic site charges for determining the electrostatic contribution to the solute–solvent interactions, we turned to an embedded cluster reference interaction site model (EC-RISM) variant that allows for using the electrostatic potential arising from the solute’s wave function directly, i.e., formally in an exact manner.
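
Acidity constants and tautomer equilibria are both fixed by reaction free energies in solution. The sketch below shows only the generic thermodynamic conversions, assuming an already computed standard free energy difference in kcal/mol. It is not the authors' pKa model, and all names and default values are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # molar gas constant in kcal/(mol K)


def pka_from_free_energy(delta_g_kcal: float, temperature: float = 298.15) -> float:
    """pKa from the standard free energy of HA -> A- + H+ in solution.

    Uses pKa = dG / (R T ln 10); dG must already include the proton term.
    """
    return delta_g_kcal / (R_KCAL * temperature * math.log(10.0))


def tautomer_fraction(delta_g_kcal: float, temperature: float = 298.15) -> float:
    """Equilibrium fraction of tautomer B for A <-> B, with dG = G_B - G_A."""
    k = math.exp(-delta_g_kcal / (R_KCAL * temperature))
    return k / (1.0 + k)


if __name__ == "__main__":
    # Hypothetical free energy differences, for illustration only.
    print(pka_from_free_energy(6.0))
    print(tautomer_fraction(1.0))
```

At 298.15 K one pKa unit corresponds to about 1.36 kcal/mol, so free energy errors of a few tenths of a kcal/mol already produce visible shifts in predicted acidity constants and tautomer ratios.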

Summary

Introduction

Drug discovery is a multidimensional optimization journey, starting from early hit molecules with multiple liabilities and leading to a clinical candidate with the desired pharmacokinetic/pharmacodynamic (PK/PD) and safety profile, which requires the parallel monitoring of different properties. Force field-based methods that use simple physics models for bonded and non-bonded interactions are well established in drug discovery for investigating conformational properties of proteins and drug-like compounds [14, 15]. They are fast, easy to apply, and provide results in a time frame that fits neatly into the design-make-test-analyze cycles used by project teams throughout industry. In the group of the authors, hybrid methods have been pioneered that embed machine learning and time-dependent DFT (TD-DFT) calculations to determine UV/vis spectral absorption descriptors of drug candidates in solution. Beyond property prediction, this method was shown to be effective for the detection of fragmental contributions to toxicity, a key prerequisite for visualization and helpful for guiding drug optimization [33]. This emphasizes the importance of an exchange of pre-competitive data between industry and academia for method development and for measuring and assessing the predictive quality of the tools.

Background of SAMPL blind prediction challenges

Results and discussion

Concluding discussion

Journal: Journal of Computer-Aided Molecular Design (2021) 35:453–472 | Publication Date: Oct 20, 2020
Citations: 9 | License type: open-access

Similar Papers

The SAMPL5 challenge for embedded-cluster integral equation theory: solvation free energies, aqueous pKa, and cyclohexane-water log D.
Nicolas Tielker ... Sebastian Ehrhart
Journal of Computer-Aided Molecular Design | VOL. 30
23 Aug 2016

Solvation effects on chemical shifts by embedded cluster integral equation theory.
Roland Frach ... Stefan M Kast
The Journal of Physical Chemistry A | VOL. 118
20 Nov 2014

SAMPL7 physical property prediction from EC-RISM theory
Nicolas Tielker ... Stefan M Kast
Journal of Computer-Aided Molecular Design | VOL. 35
19 Jul 2021

The SAMPL6 challenge on predicting aqueous pKa values from EC-RISM theory.
Nicolas Tielker ... Stefan M Kast
Journal of Computer-Aided Molecular Design | VOL. 32
02 Aug 2018
