Machine learning based energy-free structure predictions of molecules, transition states, and solids

Dominik Lemm, Guido Falk von Rudorff, O. Anatole von Lilienfeld

doi:10.1038/s41467-021-24525-7

Open Access

Journal: Nature Communications	Publication Date: Jul 22, 2021
Citations: 62	License type: open-access

Affiliation: University of Vienna, University of Basel

Abstract

The computational prediction of atomistic structure is a long-standing problem in physics, chemistry, materials, and biology. Conventionally, force-fields or ab initio methods determine structure through energy minimization, which is either approximate or computationally demanding. This accuracy/cost trade-off prohibits the generation of synthetic big data sets accounting for chemical space with atomistic detail. Exploiting implicit correlations among relaxed structures in training data sets, our machine learning model Graph-To-Structure (G2S) generalizes across compound space in order to infer interatomic distances for out-of-sample compounds, effectively enabling the direct reconstruction of coordinates and thereby bypassing the conventional energy optimization task. The numerical evidence collected includes 3D coordinate predictions for organic molecules, transition states, and crystalline solids. G2S improves systematically with training set size, reaching mean absolute interatomic distance prediction errors of less than 0.2 Å for fewer than eight thousand training structures, on par with or better than conventional structure generators. Applicability tests of G2S include successful predictions for systems which typically require manual intervention, improved initial guesses for subsequent conventional ab initio-based relaxation, and input generation for subsequent use of structure-based quantum machine learning models.
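The step from interatomic distances to coordinates can be illustrated with a minimal sketch. The abstract does not say which reconstruction algorithm G2S uses, so the example below swaps in classical multidimensional scaling as an assumption; the function names `coords_from_distances` and `distance_matrix` and the random five-point test geometry are hypothetical and serve only as illustration.

```python
import numpy as np

def coords_from_distances(D, dim=3):
    """Recover coordinates from a full pairwise distance matrix.

    Classical multidimensional scaling (an assumed stand-in, not
    necessarily the method used by G2S). The result is defined only
    up to rotation, translation, and reflection.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Double-centre the squared distances to obtain the Gram matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    G = -0.5 * J @ (D ** 2) @ J
    # Keep the `dim` largest eigenpairs; clip small negative eigenvalues
    # that noisy (e.g. predicted) distances can produce.
    w, V = np.linalg.eigh(G)
    idx = np.argsort(w)[::-1][:dim]
    w = np.clip(w[idx], 0.0, None)
    return V[:, idx] * np.sqrt(w)

def distance_matrix(X):
    """Pairwise Euclidean distances between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.linalg.norm(diff, axis=-1)

# Toy check on a random five-point geometry (synthetic, not from the paper).
rng = np.random.default_rng(0)
X_ref = rng.normal(size=(5, 3))
D = distance_matrix(X_ref)
X_rec = coords_from_distances(D)
max_dev = np.abs(distance_matrix(X_rec) - D).max()
```

With exact distances the reconstructed geometry reproduces the input distance matrix to numerical precision. With predicted, slightly inconsistent distances, the eigenvalue clipping yields a least-squares-like embedding rather than an exact one.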

Highlights

After training on sufficiently many examples, we find that G2S generated structures for out-of-sample graphs have a lower root-mean-square deviation (RMSD) than structures from ETKDG[6] and Gen3D[7] and exhibit high geometric similarity to the reference quantum chemical structure
We have presented G2S, a machine learning model capable of reconstructing 3D atomic coordinates from predicted interatomic distances using bond-network and stoichiometry as input
The applicability of G2S has been demonstrated for predicting structures of a variety of system classes including closed-shell organic molecules, transition state geometries, singlet carbene geometries, and crystal structures
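The RMSD comparison mentioned in the highlights can be sketched as follows. This is a minimal example assuming the common convention of optimal rigid-body superposition (the Kabsch algorithm) before measuring the deviation; the highlights do not state the exact protocol (for instance, whether hydrogens are included), and the function name `kabsch_rmsd` and the test geometry are hypothetical.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (n_atoms, 3) arrays after optimal superposition.

    Assumes both arrays list the same atoms in the same order.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove translation by centring both structures.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Optimal rotation from the SVD of the covariance matrix.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Guard against improper rotations (reflections).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0.0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    P_rot = Pc @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Qc) ** 2, axis=1))))

# Toy check: a rotated and shifted copy should give an RMSD of ~0.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
Y = X @ Rz.T + np.array([1.0, -2.0, 0.5])
rmsd_same = kabsch_rmsd(X, Y)
```

Because the reflection guard keeps the alignment a proper rotation, a mirror image of a chiral structure would not be reported as identical.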

Summary

Results

For atomization energy prediction of C7O2H10 and C7NOH11 isomers, G2S combined with FCHL19 still reaches an accuracy of 5 kcal/mol mean absolute error (MAE) at 1024 training points, slowly approaching the coveted chemical accuracy of 1 kcal/mol and almost matching the accuracy of a DFT structure-based BoB model. The advantage is most substantial for small training sets; in the limit of larger data sets, the performance curves of predictions based on G2S input level off, presumably due to the noise introduced by the aforementioned error type B, i.e., inherent noise and conformational effects of the predicted structures. Possible further strategies to improve on G2S could include Δ-machine learning[28], where deviations from tabulated (or universal force-field based) estimates are modeled.
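The Δ-machine learning idea named above, learning only the deviation from a cheap baseline, can be sketched on a toy problem. Everything in this example is an assumption for illustration: the one-dimensional input, the synthetic `baseline` and `reference` functions, and the use of Gaussian kernel ridge regression as the regressor are not taken from the summary.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def fit_krr(X, y, sigma, lam=1e-6):
    """Kernel ridge regression weights for targets y."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict_krr(X_train, alpha, X_new, sigma):
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

def baseline(X):
    """Hypothetical cheap estimate (stands in for a tabulated value)."""
    return 1.5 + 0.1 * X[:, 0]

def reference(X):
    """Hypothetical accurate target: baseline plus a smooth correction."""
    return baseline(X) + 0.05 * np.sin(3.0 * X[:, 0])

# Synthetic training grid and test points at the grid midpoints.
x = np.linspace(-2.0, 2.0, 30)
X_train = x[:, None]
X_test = ((x[:-1] + x[1:]) / 2.0)[:, None]

# Learn only the deviation from the baseline, then add it back.
alpha = fit_krr(X_train, reference(X_train) - baseline(X_train), sigma=0.5)
delta_pred = baseline(X_test) + predict_krr(X_train, alpha, X_test, sigma=0.5)

mae_baseline = np.mean(np.abs(baseline(X_test) - reference(X_test)))
mae_delta = np.mean(np.abs(delta_pred - reference(X_test)))
```

The design point is that the model only has to capture the residual, which is typically smaller and smoother than the full target, so fewer training points may be needed for a given accuracy.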
