A transferable active-learning strategy for reactive molecular force fields.

Tom A Young, Fernanda Duarte, Volker L Deringer, Tristan Johnston-Wood

doi:10.1039/d1sc01825f

Open Access

Journal: Chemical Science
Publication Date: Jan 1, 2021
Citations: 46
License type: CC BY 3.0

Affiliation: University of Oxford

Abstract

Predictive molecular simulations require fast, accurate and reactive interatomic potentials. Machine learning offers a promising approach to construct such potentials by fitting energies and forces to high-level quantum-mechanical data, but doing so typically requires considerable human intervention and data volume. Here we show that, by leveraging hierarchical and active learning, accurate Gaussian Approximation Potential (GAP) models can be developed for diverse chemical systems in an autonomous manner, requiring only hundreds to a few thousand energy and gradient evaluations on a reference potential-energy surface. The approach uses separate intra- and inter-molecular fits and employs a prospective error metric to assess the accuracy of the potentials. We demonstrate applications to a range of molecular systems with relevance to computational organic chemistry: ranging from bulk solvents, a solvated metal ion and a metallocage onwards to chemical reactivity, including a bifurcating Diels–Alder reaction in the gas phase and non-equilibrium dynamics (a model SN2 reaction) in explicit solvent. The method provides a route to routinely generating machine-learned force fields for reactive molecular systems.

Highlights

Molecular simulations are a cornerstone in computational chemistry, providing dynamical insights beyond experimental resolution.[1]
The initial step in validating a supervised machine learning (ML) model typically involves splitting a dataset into training and test sets, training the model, and evaluating its performance on the test set with a squared-error (RMSE/MSE) or correlation (R²) metric.
For an ML potential, the minimum required domain of applicability is the region of configuration space likely to be sampled during a simulation with the potential.
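
The conventional validation step described in the highlights above can be sketched in a few lines. This is a minimal illustration only, not the paper's workflow: a one-dimensional least-squares line stands in for an ML potential, the data are synthetic, and every function name here (`train_test_split`, `fit_line`, `rmse`, `r_squared`) is a hypothetical helper written for this example.

```python
import math
import random


def train_test_split(pairs, test_fraction=0.2, seed=0):
    """Shuffle (x, y) pairs and split them into training and test lists."""
    rng = random.Random(seed)
    shuffled = pairs[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(round(test_fraction * len(shuffled))))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(train):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic data (an assumption for illustration): a noisy straight line.
noise = random.Random(1)
data = [(i / 10, 2.0 * (i / 10) + 1.0 + noise.gauss(0.0, 0.05)) for i in range(50)]

train, test = train_test_split(data)
slope, intercept = fit_line(train)

y_true = [y for _, y in test]
y_pred = [slope * x + intercept for x, _ in test]
test_rmse = rmse(y_true, y_pred)
test_r2 = r_squared(y_true, y_pred)
print(f"test RMSE = {test_rmse:.3f}, test R^2 = {test_r2:.3f}")
```

A low test-set error of this kind only describes the region covered by the held-out data. For a potential, the relevant region is the configuration space actually visited during a simulation, which is the point of the second highlight.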

Summary

Introduction

Molecular simulations are a cornerstone in computational chemistry, providing dynamical insights beyond experimental resolution.[1] Empirical interatomic potentials (force fields), in combination with molecular dynamics (MD) or Monte Carlo (MC) simulations, have been widely used to sample the potential-energy surface (PES). However, they are limited in accuracy and transferability.[2] Most of these potentials are parameterised for isolated entities with fixed connectivity and are unable to describe bond breaking/forming processes. Ab initio methods provide an accurate description of the PES, which is critical for reactions in solution. However, because of their high computational cost and unfavourable scaling behaviour, they are limited to a few hundred atoms and simulation times of picoseconds in ab initio

Methods

Results

Conclusion