ConfRank: Improving GFN-FF Conformer Ranking with Pairwise Training.

Christian Hölzer, Rick Oerder, Stefan Grimme, Jan Hamaekers

doi:10.1021/acs.jcim.4c01524

Abstract

Conformer ranking is a crucial task for drug discovery, with methods for generating conformers often based on molecular (meta)dynamics or sophisticated sampling techniques. These methods are constrained by the underlying force computation in terms of runtime and energy ranking accuracy, which limits their effectiveness for large-scale screening applications. To address these ranking limitations, we introduce ConfRank, a machine learning-based approach that enhances conformer ranking using pairwise training. We demonstrate its performance using GFN-FF-generated conformer ensembles, leveraging the DimeNet++ architecture trained on pairs of 159 760 uncharged organic compounds from the GEOM data set at the r2SCAN-3c reference level. Instead of predicting only on single molecules, this approach captures relative energy differences between conformers, leading to a significant improvement of the overall conformational ranking and outperforming both GFN-FF and GFN2-xTB. The pairwise RMSD of the relative energy difference of two conformers can thereby be reduced from 5.65 to 0.71 kcal mol⁻¹ on the test data set, making it possible to correctly identify up to 81% of all lowest-lying conformers (GFN-FF: 10%, GFN2-xTB: 47%). The ConfRank approach is cost-effective and allows for scalable deployment on both CPU and GPU, achieving runtime accelerations of up to 2 orders of magnitude compared to GFN2-xTB. Out-of-sample investigations on CREST-generated conformer ensembles from the QM9 data set and on conformers taken from an extended GMTKN55 data set show promising results for the robustness of this approach. There, ranking correlation coefficients such as Spearman's can be improved to 0.90 (GFN-FF: 0.39, GFN2-xTB: 0.84), reducing the probability of an incorrect sign flip in pairwise energy comparisons from 32 to 7%. On the extended GMTKN55 data set, the pairwise MAD (RMSD) could be reduced on almost all subsets by up to 62% (58%), with an average improvement of 30% (29%). Moreover, an exemplary case study on vancomycin shows similar performance, indicating applicability to larger (bio)molecular structures. Furthermore, we motivate the usage of the pairwise training approach from a theoretical perspective, highlighting that while pairwise training can lead to a decline in single-sample prediction of absolute energies for ML models, it significantly enhances conformer ranking performance. The data and models used in this study are available at https://github.com/grimme-lab/confrank.
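
The pairwise quantities named in the abstract (pairwise MAD and RMSD, the sign-flip probability, and the Spearman coefficient) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `pairwise_metrics` and `spearman` are hypothetical, the metric definitions are the plain textbook ones and are assumed rather than taken from the paper, and the toy energies are invented for illustration. The last lines also show why a model can rank well while predicting poor absolute energies: adding a constant offset to every prediction leaves all pairwise metrics unchanged.

```python
import math
from itertools import combinations


def pairwise_metrics(e_ref, e_pred):
    """Pairwise error metrics for one conformer ensemble (energies in kcal/mol)."""
    errors = []
    flips = 0
    for i, j in combinations(range(len(e_ref)), 2):
        d_ref = e_ref[i] - e_ref[j]      # reference energy difference of the pair
        d_pred = e_pred[i] - e_pred[j]   # predicted energy difference of the pair
        errors.append(d_pred - d_ref)
        if d_ref * d_pred < 0:           # predicted ordering of the pair is reversed
            flips += 1
    n_pairs = len(errors)
    mad = sum(abs(x) for x in errors) / n_pairs
    rmsd = math.sqrt(sum(x * x for x in errors) / n_pairs)
    return {"mad": mad, "rmsd": rmsd, "sign_flip_rate": flips / n_pairs}


def spearman(x, y):
    """Spearman rank correlation; this simple formula assumes no tied values."""
    def ranks(values):
        order = sorted(range(len(values)), key=values.__getitem__)
        result = [0] * len(values)
        for rank, idx in enumerate(order):
            result[idx] = rank
        return result

    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Invented toy energies (kcal/mol) for four conformers of one molecule.
e_ref = [0.0, 0.8, 1.5, 3.1]    # stands in for the reference level
e_pred = [0.2, 0.6, 1.9, 1.7]   # stands in for a model prediction
e_shifted = [e + 100.0 for e in e_pred]  # same prediction with a constant offset

metrics = pairwise_metrics(e_ref, e_pred)
metrics_shifted = pairwise_metrics(e_ref, e_shifted)
rho = spearman(e_ref, e_pred)

print(metrics)
print(metrics_shifted)  # agrees with `metrics` up to floating-point rounding
print(rho)
```

In this toy ensemble only the last two conformers are swapped by the prediction, so one of the six pairs has a flipped sign. A per-molecule offset cancels in every energy difference, which is the sense in which pairwise training can trade absolute-energy accuracy for ranking accuracy.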

Journal: Journal of Chemical Information and Modeling
