Abstract

The length and time scales of atomistic simulations are limited by the computational cost of the methods used to predict material properties. In recent years there has been great progress in the use of machine-learning algorithms to develop fast and accurate interatomic potential models, but it remains a challenge to develop models that generalize well and are fast enough to be used at extreme time and length scales. To address this challenge, we have developed a machine-learning algorithm based on symbolic regression in the form of genetic programming that is capable of discovering accurate, computationally efficient many-body potential models. The key to our approach is to explore a hypothesis space of models based on fundamental physical principles and to select models within this hypothesis space based on their accuracy, speed, and simplicity. The focus on simplicity reduces the risk of overfitting the training data and increases the chances of discovering a model that generalizes well. Our algorithm was validated by rediscovering an exact Lennard-Jones potential and a Sutton-Chen embedded-atom method potential from training data generated using these models. By using training data generated from density functional theory calculations, we found potential models for elemental copper that are simple, as fast as embedded-atom models, and capable of accurately predicting properties outside of their training set. Our approach requires relatively small sets of training data, making it possible to generate training data using highly accurate methods at a reasonable computational cost. We present our approach, the forms of the discovered models, and assessments of their transferability, accuracy, and speed.
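The full algorithm evolves symbolic expressions with genetic programming; as a minimal illustration of the selection principle described above (scoring candidate models by both accuracy and simplicity), the sketch below evaluates a few hand-written candidate functional forms against pairwise energies from a reference Lennard-Jones potential. The candidate list, the complexity values, and the penalty weight `lam` are illustrative assumptions, not the paper's actual hypothesis space or scoring function.

```python
# Toy model selection: score candidate pair potentials on reference
# Lennard-Jones data, penalizing complexity (an assumed stand-in for
# the paper's accuracy/simplicity trade-off).

def lj(r):
    """Reference Lennard-Jones energy with epsilon = sigma = 1."""
    s6 = (1.0 / r) ** 6
    return 4.0 * (s6 * s6 - s6)

# Synthetic training data: (distance, energy) pairs.
data = [(0.95 + 0.05 * i, lj(0.95 + 0.05 * i)) for i in range(15)]

# Hypothetical hypothesis space: (name, function, assumed complexity).
candidates = [
    ("harmonic", lambda r: (r - 1.12) ** 2 - 1.0, 3),
    ("inverse", lambda r: -1.0 / r, 2),
    ("lennard-jones", lambda r: 4.0 * ((1.0 / r) ** 12 - (1.0 / r) ** 6), 5),
]

def score(f, complexity, lam=1e-3):
    """Mean squared error plus a small complexity penalty."""
    mse = sum((f(r) - e) ** 2 for r, e in data) / len(data)
    return mse + lam * complexity

# The exact form wins: zero error outweighs its higher complexity.
best_name = min(candidates, key=lambda c: score(c[1], c[2]))[0]
```

Here the Lennard-Jones candidate is selected because its error vanishes on the reference data; a larger `lam` would shift selection toward simpler forms, which is the knob that discourages overfitting.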

Highlights

  • The development of an interatomic potential model is treated as a supervised learning problem,[21] in which an optimization algorithm is used to search a hypothesis space of possible functions to find those that best reproduce the energies, forces, and possibly other properties of a set of training data

  • We tested its ability to rediscover the exact form of two interatomic potentials: the Lennard-Jones potential and the Sutton-Chen (SC) embedded-atom method (EAM) potential

  • Having established that our genetic programming algorithm can find the exact form of simple pair and many-body potentials, we evaluated its ability to find potential models from data generated using density functional theory[52] (DFT)


Introduction

There have been great advances in the use of machine learning to develop interatomic potential models.[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20] In this approach, the development of an interatomic potential model is treated as a supervised learning problem,[21] in which an optimization algorithm is used to search a hypothesis space of possible functions to find those that best reproduce the energies, forces, and possibly other properties of a set of training data. Potential models developed in this way are often able to achieve accuracy close to that of the method used to generate the training data, with linear scaling in system size and an orders-of-magnitude increase in speed. Alternatively, potential models may be generated by using fundamental physical relationships to derive a simple parameterized function. The parameters of this function are typically fit to a smaller set of training data. Examples of potential models generated using this latter approach include the embedded-atom method (EAM) and bond-order potentials.[22,23,24,25,26,27,28]
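The supervised-learning view described above (fit a parameterized function so that it reproduces reference energies) can be sketched in a few lines. The example below recovers the parameters of a Lennard-Jones pair potential from synthetic energy data by a simple grid search; the grid resolution and distance range are illustrative assumptions, and a real fit would use forces and a proper optimizer.

```python
# Toy supervised fit: recover Lennard-Jones parameters (epsilon, sigma)
# from reference pairwise energies via a coarse grid search.

def lj_energy(r, eps, sigma):
    """Lennard-Jones pair energy: 4*eps*((sigma/r)^12 - (sigma/r)^6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)

# "Training data" generated from a reference potential with known parameters.
true_eps, true_sigma = 1.0, 1.0
distances = [0.9 + 0.05 * i for i in range(20)]
training = [(r, lj_energy(r, true_eps, true_sigma)) for r in distances]

def loss(eps, sigma):
    """Sum of squared energy errors over the training set."""
    return sum((lj_energy(r, eps, sigma) - e) ** 2 for r, e in training)

# Search a small parameter grid (0.5..1.5 in steps of 0.1) for the best fit.
best = min(
    ((e / 10, s / 10) for e in range(5, 16) for s in range(5, 16)),
    key=lambda p: loss(*p),
)
```

Because the reference parameters lie on the grid, the search recovers them exactly; with noisy or DFT-derived data the minimum would instead be the best compromise over the training set.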
