Real-time prediction of 1H and 13C chemical shifts with DFT accuracy using a 3D graph neural network.

Yanfei Guan, Liliana C Gallegos, Robert S Paton, Peter C St John, S V Shree Sowndarya

doi:10.1039/d1sc03343c

Open Access

Abstract

Nuclear magnetic resonance (NMR) is one of the primary techniques used to elucidate the chemical structure, bonding, stereochemistry, and conformation of organic compounds. The distinct chemical shifts in an NMR spectrum depend upon each atom's local chemical environment and are influenced by both through-bond and through-space interactions with other atoms and functional groups. The in silico prediction of NMR chemical shifts using quantum mechanical (QM) calculations is now commonplace in aiding organic structural assignment since spectra can be computed for several candidate structures and then compared with experimental values to find the best possible match. However, the computational demands of calculating multiple structural- and stereo-isomers, each of which may typically exist as an ensemble of rapidly-interconverting conformations, are expensive. Additionally, the QM predictions themselves may lack sufficient accuracy to identify a correct structure. In this work, we address both of these shortcomings by developing a rapid machine learning (ML) protocol to predict 1H and 13C chemical shifts through an efficient graph neural network (GNN) using 3D structures as input. Transfer learning with experimental data is used to improve the final prediction accuracy of a model trained using QM calculations. When tested on the CHESHIRE dataset, the proposed model predicts observed 13C chemical shifts with comparable accuracy to the best-performing DFT functionals (1.5 ppm) in around 1/6000 of the CPU time. An automated prediction webserver and graphical interface are accessible online at http://nova.chem.colostate.edu/cascade/. 
We further demonstrate the model in three applications: first, we use the model to decide the correct organic structure from candidates through experimental spectra, including complex stereoisomers; second, we automatically detect and revise incorrect chemical shift assignments in a popular NMR database, the NMRShiftDB; and third, we use NMR chemical shifts as descriptors for determination of the sites of electrophilic aromatic substitution.
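
The structure-assignment idea in the abstract (predict shifts for several candidate structures, then pick the one that best matches experiment) can be sketched as below. This is a minimal illustration, not the authors' protocol: it assumes each predicted shift has already been matched to its experimental counterpart, it scores candidates by mean absolute error only, and the names `rank_candidates`, `isomer_A`, and `isomer_B`, as well as all shift values, are hypothetical.

```python
from typing import Dict, List, Tuple


def mean_absolute_error(predicted: List[float], observed: List[float]) -> float:
    """Mean absolute deviation (ppm) between two atom-matched shift lists."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("shift lists must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


def rank_candidates(
    candidates: Dict[str, List[float]], observed: List[float]
) -> List[Tuple[str, float]]:
    """Return (name, MAE) pairs sorted from best to worst agreement."""
    scored = [
        (name, mean_absolute_error(shifts, observed))
        for name, shifts in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical 13C shifts in ppm, invented for illustration only.
observed_shifts = [170.2, 128.5, 60.1, 14.3]
candidate_shifts = {
    "isomer_A": [171.0, 129.4, 61.0, 14.0],
    "isomer_B": [175.9, 133.0, 55.2, 18.8],
}

ranking = rank_candidates(candidate_shifts, observed_shifts)
for name, mae in ranking:
    print(f"{name}: MAE = {mae:.3f} ppm")
```

A ranking of this kind is only meaningful when the prediction error is small relative to the shift differences between the candidates, which is why the accuracy of the underlying model matters.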

Highlights

Nuclear magnetic resonance (NMR) spectra are a primary source of molecular structural information
We demonstrate improvements in the predictive accuracy of a density functional theory (DFT)-trained model by applying transfer learning (TL) with a smaller collection of experimental values: following model retraining against a curated set of 13C experimental shifts, a mean absolute error (MAE) of 1.23 ppm against experiment could be obtained for 500 held-out structures
The success of this approach arises from the strong correlation between DFT chemical shifts and experimental shifts, the molecular structures shared by DFT8K and Exp5K, and the strategy of freezing 94% of graph neural network (GNN) hyperparameters during TL
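
The freezing strategy mentioned in the highlights can be illustrated with a toy example. This is a sketch under stated assumptions, not the authors' implementation: it takes "freezing" to mean holding a subset of network weights fixed while the rest are updated on experimental data, and the layer names (`message_passing`, `readout`), their sizes, and the choice of which layer stays trainable are all hypothetical. The sizes were picked only so that the frozen fraction comes out at 94%.

```python
from typing import Dict, List, Set

Params = Dict[str, List[float]]


def frozen_fraction(params: Params, trainable: Set[str]) -> float:
    """Fraction of all weights that belong to layers not marked trainable."""
    total = sum(len(weights) for weights in params.values())
    frozen = sum(
        len(weights) for name, weights in params.items() if name not in trainable
    )
    return frozen / total


def sgd_step(params: Params, grads: Params, trainable: Set[str], lr: float) -> Params:
    """One gradient-descent step that leaves frozen layers untouched."""
    updated: Params = {}
    for name, weights in params.items():
        if name in trainable:
            updated[name] = [w - lr * g for w, g in zip(weights, grads[name])]
        else:
            updated[name] = list(weights)  # frozen: copied unchanged
    return updated


# Hypothetical layers and sizes; not the architecture described in the paper.
params = {
    "message_passing": [0.5] * 47,
    "readout": [0.5] * 3,
}
grads = {name: [1.0] * len(weights) for name, weights in params.items()}
trainable = {"readout"}

new_params = sgd_step(params, grads, trainable, lr=0.1)
print(f"frozen fraction: {frozen_fraction(params, trainable):.2f}")
```

Keeping most weights fixed limits how far a small experimental dataset can pull the model away from what it learned from the larger DFT dataset.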

Summary

Introduction

To serve as a useful tool for structure elucidation, prediction errors in computed chemical shifts must be smaller than the experimental variations between different candidate structures. The need for a repository of publicly accessible raw NMR data has been articulated elsewhere.[54] To address these challenges, we set out to exploit advances in quantum chemistry, high-performance computing, and automation in developing a large dataset of QM computed values to train an ML model.[38,40,44,45,55,56,57] A principal advantage of this approach is that DFT-based predictions of chemical shifts can be mapped to the responsible atom in a high-throughput fashion with complete reliability, avoiding incomplete or erroneous assignments and the need for manual intervention. The success of this approach arises from the strong correlation between DFT chemical shifts and experimental shifts, the molecular structures shared by DFT8K and Exp5K, and the strategy of freezing 94% of GNN hyperparameters during TL. These GNN-derived atomic descriptors impose low computational cost such that we anticipate future utility in related prediction tasks of organic reactivity and selectivity, for example in combination with other machine-learned representations.[87]
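
To make the "strong correlation between DFT chemical shifts and experimental shifts" concrete, the sketch below fits a straight line between computed and observed values by ordinary least squares. It is an illustration of what such a correlation looks like numerically, not the transfer-learning procedure described above; the function name `fit_line` and every shift value are hypothetical, and the data were constructed to lie exactly on a line.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical 13C shifts in ppm; "experimental" values were generated as
# 0.95 * computed + 2.0 purely for illustration.
computed = [10.0, 50.0, 100.0, 150.0]
experimental = [11.5, 49.5, 97.0, 144.5]

slope, intercept = fit_line(computed, experimental)
corrected = [slope * c + intercept for c in computed]
print(f"slope = {slope:.3f}, intercept = {intercept:.3f} ppm")
```

When the two quantities are this closely related, a model trained on computed shifts needs only a modest correction to reproduce experimental ones, which is consistent with the small experimental set sufficing for retraining.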

Journal: Chemical Science
Publication Date: Jan 1, 2021
Citations: 68
License type: CC BY 3.0
