Abstract
Predictive molecular simulations require fast, accurate and reactive interatomic potentials. Machine learning offers a promising approach to construct such potentials by fitting energies and forces to high-level quantum-mechanical data, but doing so typically requires considerable human intervention and data volume. Here we show that, by leveraging hierarchical and active learning, accurate Gaussian Approximation Potential (GAP) models can be developed for diverse chemical systems in an autonomous manner, requiring only hundreds to a few thousand energy and gradient evaluations on a reference potential-energy surface. The approach uses separate intra- and inter-molecular fits and employs a prospective error metric to assess the accuracy of the potentials. We demonstrate applications to a range of molecular systems with relevance to computational organic chemistry: ranging from bulk solvents, a solvated metal ion and a metallocage onwards to chemical reactivity, including a bifurcating Diels–Alder reaction in the gas phase and non-equilibrium dynamics (a model SN2 reaction) in explicit solvent. The method provides a route to routinely generating machine-learned force fields for reactive molecular systems.
Highlights
Molecular simulations are a cornerstone in computational chemistry, providing dynamical insights beyond experimental resolution.[1]
The initial step in validating a supervised machine learning (ML) model typically involves splitting a dataset into training and test sets, training the model, and evaluating its performance on the test set with a squared-error (RMSE/MSE) or correlation (R²) metric.
In a machine learning (ML) potential, the minimum required domain of applicability is the region of configuration space likely to be sampled during a simulation with the potential.
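The conventional validation workflow mentioned in the highlights (split the data, train, score on the held-out set) can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the linear least-squares fit stands in for an actual ML potential, and the helper names (`train_test_split`, `rmse`, `r_squared`) are hypothetical. It is not the GAP training procedure or the prospective error metric described in the abstract.

```python
import numpy as np


def train_test_split(X, y, test_fraction=0.2, seed=0):
    """Randomly partition (X, y) into training and test subsets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = max(1, int(round(test_fraction * len(X))))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]


def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r_squared(y_true, y_pred):
    """Coefficient of determination (R^2)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic stand-in for (descriptor, energy) pairs: an assumption for
# illustration only, not data from the paper.
rng = np.random.default_rng(42)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y)

# Linear least-squares fit as a placeholder for the trained model.
w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

y_pred = X_test @ w
test_rmse = rmse(y_test, y_pred)
test_r2 = r_squared(y_test, y_pred)
print(f"test RMSE = {test_rmse:.4f}, test R^2 = {test_r2:.4f}")
```

A test set drawn from the same distribution as the training set only measures accuracy on configurations resembling the training data. It does not by itself show that the model is reliable across the region of configuration space a simulation will actually visit.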
Summary
Molecular simulations are a cornerstone in computational chemistry, providing dynamical insights beyond experimental resolution.[1] Empirical interatomic potentials (force fields), in combination with molecular dynamics (MD) or Monte Carlo (MC) simulations, have been widely used to sample the potential-energy surface (PES). However, they are limited in accuracy and transferability.[2] Most of these potentials are parameterised for isolated entities with fixed connectivity and are unable to describe bond breaking/forming processes. Ab initio methods provide an accurate description of the PES, which is critical for reactions in solution. Because of their high computational cost and unfavourable scaling behaviour, they are limited to a few hundred atoms and simulation times of picoseconds in ab initio