Abstract
Virtual materials screening approaches have proliferated in the past decade, driven by rapid advances in first-principles computational techniques and machine-learning algorithms. By comparison, computationally driven materials synthesis screening is still in its infancy and is hampered by the challenges of data sparsity and data scarcity: synthesis routes exist in a sparse, high-dimensional parameter space that is difficult to optimize over directly, and, for some materials of interest, only a small number of literature-reported syntheses are available. In this article, we present a framework for suggesting quantitative synthesis parameters and potential driving factors for synthesis outcomes. We use a variational autoencoder to compress sparse synthesis representations into a lower-dimensional space, which is found to improve the performance of machine-learning tasks. To realize this screening framework even in cases where little literature data is available, we devise a novel data augmentation methodology that incorporates literature synthesis data from related materials systems. We apply this variational autoencoder framework to generate potential SrTiO3 synthesis parameter sets, propose driving factors for brookite TiO2 formation, and identify correlations between alkali-ion intercalation and MnO2 polymorph selection.
Highlights
We compare (1) the unmodified canonical synthesis features, which include descriptors such as heating temperatures or solvent concentrations, (2) canonical features modified by linear dimensionality reduction with principal component analysis (PCA), and (3) canonical features modified by non-linear dimensionality reduction using a variational autoencoder (VAE).
This implies that the data compressed via PCA has lost information critical to predicting the target synthesized material associated with each set of synthesis parameters, and provides us with a baseline performance against which to compare the non-linear VAE method for feature representation learning.
We develop an additional machine-learning model to validate the quality of the learned SrTiO3 VAE latent space.
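The linear PCA baseline mentioned in the highlights can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration: the feature matrix is random synthetic data rather than literature syntheses, the matrix size, sparsity level, and two-component latent dimension are arbitrary, and the helper names `pca_compress` and `pca_reconstruct` are hypothetical. The sketch only shows how a sparse feature matrix can be projected onto a few principal components and how much variance that projection retains.

```python
import numpy as np

def pca_compress(X, n_components):
    """Project rows of X onto the top principal components (linear compression)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data; the rows of Vt are the principal directions.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    Z = Xc @ components.T
    # Fraction of the total variance kept by the retained components.
    explained = float((S[:n_components] ** 2).sum() / (S ** 2).sum())
    return Z, components, mean, explained

def pca_reconstruct(Z, components, mean):
    """Map compressed coordinates back to the original feature space."""
    return Z @ components + mean

rng = np.random.default_rng(0)
# Hypothetical sparse "synthesis features": 200 syntheses x 30 descriptors,
# with roughly 85% of the entries set to zero.
X = rng.random((200, 30)) * (rng.random((200, 30)) < 0.15)

Z, comps, mean, explained = pca_compress(X, n_components=2)
X_hat = pca_reconstruct(Z, comps, mean)
mse = float(np.mean((X - X_hat) ** 2))
print(f"variance retained: {explained:.3f}, reconstruction MSE: {mse:.5f}")
```

A low retained-variance fraction on sparse data of this kind is one simple way to see why a linear projection may discard information that a downstream classifier needs, which is the motivation for comparing against a non-linear encoder.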
Summary
To accelerate the design and realization of novel materials, a number of recent studies have screened promising candidates across a variety of categories, including light-emitting molecules,[1] perovskite compounds,[2,3,4,5] catalysts,[6,7] thermoelectrics,[8,9,10,11,12] and metal-organic frameworks.[13,14] The rise of virtual materials screening, along with high-throughput first-principles computations and experimentation, has resulted in the creation of numerous accessible databases for the materials science community.[15,16,17,18,19,20,21,22] There is a pressing need for analogous virtual screening of inorganic materials syntheses to complement the growing volume of predicted and screened compounds.[23,24] Such synthesis screening approaches have found recent success in organic chemistry, where a wealth of tabulated reaction data is available,[25,26,27,28,29,30,31,32,33,34,35] and synthesis parameter screening, driven by machine learning, has been explored for the specific case of organically templated metal vanadium selenites.[20] A computational synthesis screening framework is presented in which a variational autoencoder (VAE) neural network is used to learn compressed synthesis representations from sparse descriptors, and a novel data augmentation approach is developed to enable this framework for materials with uncommon syntheses.
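As a rough illustration of how a VAE compresses sparse descriptors and generates new parameter sets, the following NumPy sketch computes a standard VAE objective (reconstruction error plus a Kullback-Leibler term against a unit Gaussian prior) and decodes samples drawn from the prior. It is not the architecture used in this work, which the text above does not specify: the layer sizes, the tanh activations, the squared-error reconstruction term, the untrained random weights, and all function names are assumptions chosen to keep the example short.

```python
import numpy as np

def init_params(rng, n_features, n_hidden, n_latent, scale=0.1):
    """Random (untrained) weights for a one-hidden-layer encoder and decoder."""
    return {
        "W_enc": rng.normal(0.0, scale, (n_features, n_hidden)),
        "W_mu": rng.normal(0.0, scale, (n_hidden, n_latent)),
        "W_logvar": rng.normal(0.0, scale, (n_hidden, n_latent)),
        "W_dec1": rng.normal(0.0, scale, (n_latent, n_hidden)),
        "W_dec2": rng.normal(0.0, scale, (n_hidden, n_features)),
    }

def encode(params, x):
    """Map features to the mean and log-variance of the latent Gaussian."""
    h = np.tanh(x @ params["W_enc"])
    return h @ params["W_mu"], h @ params["W_logvar"]

def reparameterize(rng, mu, logvar):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def decode(params, z):
    """Map latent vectors back to the synthesis feature space."""
    h = np.tanh(z @ params["W_dec1"])
    return h @ params["W_dec2"]

def vae_loss(params, x, rng):
    """Reconstruction error plus KL divergence to the N(0, I) prior."""
    mu, logvar = encode(params, x)
    z = reparameterize(rng, mu, logvar)
    x_hat = decode(params, z)
    recon = float(np.mean(np.sum((x - x_hat) ** 2, axis=1)))
    kl = float(np.mean(-0.5 * np.sum(1 + logvar - mu ** 2 - np.exp(logvar), axis=1)))
    return recon + kl, recon, kl

rng = np.random.default_rng(0)
# Hypothetical sparse synthesis descriptors (64 syntheses x 30 features).
x = rng.random((64, 30)) * (rng.random((64, 30)) < 0.15)
params = init_params(rng, n_features=30, n_hidden=16, n_latent=2)

loss, recon, kl = vae_loss(params, x, rng)

# Generating candidate parameter sets: sample the prior and decode.
z_new = rng.standard_normal((5, 2))
candidates = decode(params, z_new)
print(loss, candidates.shape)
```

In practice the weights would be trained by minimizing this loss with an automatic-differentiation library, and the decoded candidates would need to be mapped back to physical units before being interpreted as synthesis conditions.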