Active learning (AL) can drastically accelerate materials discovery; its power has been shown across various classes of materials and target properties. Prior efforts have used machine learning models for the optimal selection of physical experiments or physics-based simulations. However, the simulation-based efforts have been mostly limited to electronic structure calculations and to properties that can be obtained at the unit-cell level with negligible noise. We couple AL with molecular dynamics simulations to identify multiple principal component alloys (MPCAs) with high melting temperatures. Building on cloud computing services through nanoHUB, we present a fully autonomous workflow for the efficient exploration of the high-dimensional compositional space of MPCAs. We characterize how the convergence of the approach is affected by the uncertainties arising from the stochastic nature of the simulations and by the acquisition functions used to select simulations. Interestingly, we find that relatively short simulations with significant uncertainties can be used to efficiently find the desired alloys, because the random forest models used for AL average out fluctuations.
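The loop described above can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper. The function `noisy_simulation` is a hypothetical toy objective with Gaussian noise standing in for a stochastic molecular dynamics melting-temperature calculation; a bootstrap ensemble of nearest-neighbour regressors stands in for the random forest so that the example needs only the Python standard library; and the acquisition function is an upper confidence bound with an assumed weight `kappa`. The candidate grid, noise level, and all parameter values are illustrative only.

```python
import random
import statistics


def noisy_simulation(x, rng, noise=0.5):
    """Hypothetical stand-in for a stochastic simulation: a smooth toy
    objective peaked at (0.3, 0.7), plus Gaussian noise."""
    true_value = 10.0 - 20.0 * ((x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2)
    return true_value + rng.gauss(0.0, noise)


def knn_predict(train, x, k=3):
    """Average the targets of the k training points nearest to x."""
    def sq_dist(point):
        return (point[0][0] - x[0]) ** 2 + (point[0][1] - x[1]) ** 2
    nearest = sorted(train, key=sq_dist)[:k]
    return statistics.fmean([y for _, y in nearest])


def ensemble_predict(train, x, rng, n_members=20):
    """Bootstrap ensemble (assumed surrogate, not a random forest):
    returns the mean and spread of the per-member predictions."""
    preds = []
    for _ in range(n_members):
        sample = [rng.choice(train) for _ in train]  # resample with replacement
        preds.append(knn_predict(sample, x))
    return statistics.fmean(preds), statistics.pstdev(preds)


def active_learning(n_init=5, n_iter=25, kappa=1.0, seed=0):
    """Run the loop and return the list of (candidate, noisy result) pairs."""
    rng = random.Random(seed)
    candidates = [(i / 10, j / 10) for i in range(11) for j in range(11)]
    rng.shuffle(candidates)

    # Seed the surrogate with a few randomly chosen simulations.
    train = []
    for _ in range(n_init):
        x = candidates.pop()
        train.append((x, noisy_simulation(x, rng)))

    for _ in range(n_iter):
        # Score every untested candidate with an upper confidence bound.
        scores = []
        for x in candidates:
            mean, std = ensemble_predict(train, x, rng)
            scores.append(mean + kappa * std)
        best = max(range(len(candidates)), key=scores.__getitem__)
        # Run the selected "simulation" and add it to the training data.
        x = candidates.pop(best)
        train.append((x, noisy_simulation(x, rng)))
    return train


if __name__ == "__main__":
    history = active_learning()
    best_x, best_y = max(history, key=lambda pair: pair[1])
    print("best candidate found:", best_x, "noisy value:", round(best_y, 2))
```

Because each ensemble prediction averages over several noisy neighbouring results, a single fluctuating simulation has limited influence on which candidate is selected next, which is the qualitative effect the abstract attributes to the random forest models.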