Abstract

We present a hybrid similarity kernel that integrates short‐ and long‐range descriptors through an average kernel approach. This technique provides a direct measure of the similarity between amorphous configurations and, when combined with an active learning (AL) spectral clustering approach, classifies the amorphous configurations into uncorrelated clusters. Subsequently, a minimum‐size database is built from a small fraction of the configurations belonging to each cluster, and a machine learning interatomic potential (MLIP), within the Gaussian approximation scheme, is fitted by relying on a Bayesian optimization of the potential hyperparameters. This step is embedded within an AL loop that sequentially enlarges the learning database whenever the MLIP fails to meet a predefined energy convergence threshold. As such, MLIPs are fitted in an almost fully automated fashion. This approach is tested on two diverse amorphous systems that were previously generated using first‐principles molecular dynamics. Accurate potentials, with a root‐mean‐square energy error below 2 meV/atom relative to the reference data, are obtained. This accuracy is achieved with only 175 configurations sampling the studied systems at various temperatures. The robustness of these potentials is then confirmed by producing models with several thousand atoms that show good agreement with reference ab initio and experimental data.
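To make the similarity measure and the clustering step more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each configuration is given as an array of per-atom descriptor vectors, uses a polynomial dot-product kernel averaged over all atom pairs, and mixes a short-range and a long-range kernel with a fixed weight. The function names and the `weight` and `zeta` parameters are hypothetical. The clustering is simplified to a two-way split using the sign of the second eigenvector of the normalized graph Laplacian, which stands in for a full spectral clustering.

```python
import numpy as np


def average_kernel(desc_a, desc_b, zeta=2):
    """Average of a polynomial kernel over all atom pairs of two configurations.

    desc_a, desc_b: arrays of shape (n_atoms, n_features), one row per atom.
    zeta is an assumed kernel exponent (hypothetical default).
    """
    a = desc_a / np.linalg.norm(desc_a, axis=1, keepdims=True)
    b = desc_b / np.linalg.norm(desc_b, axis=1, keepdims=True)
    return np.mean((a @ b.T) ** zeta)


def normalized_kernel(desc_a, desc_b, zeta=2):
    """Average kernel rescaled so that a configuration has similarity 1 with itself."""
    k_ab = average_kernel(desc_a, desc_b, zeta)
    k_aa = average_kernel(desc_a, desc_a, zeta)
    k_bb = average_kernel(desc_b, desc_b, zeta)
    return k_ab / np.sqrt(k_aa * k_bb)


def hybrid_kernel_matrix(short, long, weight=0.5, zeta=2):
    """Similarity matrix mixing short- and long-range kernels.

    short, long: lists of per-atom descriptor arrays, one entry per configuration.
    weight is an assumed mixing coefficient between 0 and 1.
    """
    n = len(short)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            k_short = normalized_kernel(short[i], short[j], zeta)
            k_long = normalized_kernel(long[i], long[j], zeta)
            K[i, j] = K[j, i] = weight * k_short + (1.0 - weight) * k_long
    return K


def spectral_bipartition(K):
    """Two-way split from the sign of the second Laplacian eigenvector."""
    degree = K.sum(axis=1)
    laplacian = np.eye(len(K)) - K / np.sqrt(np.outer(degree, degree))
    _, vectors = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    return (vectors[:, 1] > 0).astype(int)
```

In a workflow of the kind described above, a few configurations would then be drawn from each cluster to seed the learning database, with more added whenever the fitted potential misses the energy threshold. That selection and fitting loop is not shown here.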