Molecular dynamics (MD) simulations can reduce the need for experimental tests and provide detailed insight into chemical reactions and binding kinetics. Two challenges arise when working with MD simulations: the first is their limited time and length scales, and the second is efficiently processing the massive amount of data they produce to extract proper reaction rates. In this work, we evaluated the use of regression machine learning (ML) methods to address these two challenges by developing a framework for ethanol adsorption on an aluminium (Al) slab. The framework comprises three main stages: first, an all-atom molecular dynamics model; second, ML regression models; and third, validation and testing. In stage one, the adsorption of ethanol molecules on the Al surface is simulated for various temperatures, velocities and concentrations using the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) with the ReaxFF reactive force field. The output of stage one is used for training, testing, and validating the predictive models in stages two and three. We developed and evaluated 28 different ML models for predicting the number of adsorbed molecules over time, including linear regression, support vector machine (SVM), decision tree, ensemble, Gaussian process regression (GPR), and neural network (NN) models, as well as models with Bayesian hyper-parameter optimisation. Based on the results, the Bayesian-based GPR showed the highest accuracy and the lowest training time. The developed model can predict the number of adsorbed molecules for new cases within seconds, whereas the corresponding MD simulations take a few weeks. The resulting adsorption rate can then be used in macroscale simulations to overcome the time and length scale limitations. The proposed numerical framework has the potential to be generalised and could therefore contribute to future low-cost estimation of binding reactions, providing a valuable tool for industry and experimentalists.
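To illustrate the kind of regression described above, the following is a minimal sketch of Gaussian process regression for the number of adsorbed molecules as a function of time. It is not the authors' implementation: the squared-exponential kernel, the fixed hyper-parameters (`length_scale`, `variance`, `noise`), the helper names, and the synthetic saturating curve used as training data are all assumptions made for illustration, and no Bayesian hyper-parameter optimisation is performed.

```python
import math

def rbf_kernel(a, b, length_scale, variance):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(matrix):
    """Return lower-triangular L such that L L^T equals a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def solve_cholesky(lower, rhs):
    """Solve (L L^T) x = rhs by forward and then backward substitution."""
    n = len(rhs)
    y = [0.0] * n
    for i in range(n):
        y[i] = (rhs[i] - sum(lower[i][k] * y[k] for k in range(i))) / lower[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(lower[k][i] * x[k] for k in range(i + 1, n))) / lower[i][i]
    return x

def gpr_fit(times, counts, length_scale=1.0, variance=100.0, noise=1e-2):
    """Fit a GP with constant mean; hyper-parameters are fixed placeholders, not tuned."""
    mean = sum(counts) / len(counts)
    n = len(times)
    gram = [[rbf_kernel(times[i], times[j], length_scale, variance)
             + (noise if i == j else 0.0) for j in range(n)] for i in range(n)]
    alpha = solve_cholesky(cholesky(gram), [c - mean for c in counts])
    return {"times": list(times), "alpha": alpha, "mean": mean,
            "length_scale": length_scale, "variance": variance}

def gpr_predict(model, t):
    """Posterior mean of the number of adsorbed molecules at time t."""
    k_star = [rbf_kernel(t, ti, model["length_scale"], model["variance"])
              for ti in model["times"]]
    return model["mean"] + sum(k * a for k, a in zip(k_star, model["alpha"]))

# Synthetic stand-in for MD output (NOT data from the study): a saturating
# adsorption curve sampled at 11 time points in arbitrary units.
times = [float(t) for t in range(11)]
counts = [50.0 * (1.0 - math.exp(-0.4 * t)) for t in times]

model = gpr_fit(times, counts)
print(gpr_predict(model, 4.5))  # prediction between two training times
```

In this sketch the expensive step (generating `counts`) would be replaced by MD results, after which each prediction costs only one kernel evaluation per training point. Far from the training range the prediction falls back to the mean of the training data, so such a model should only be queried within the conditions it was trained on.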