Abstract
The identification of descriptors of materials properties and functions that capture the underlying physical mechanisms is a critical goal in data-driven materials science. Only such descriptors will enable a trustworthy and efficient scanning of materials spaces and possibly the discovery of new materials. Recently, the sure-independence screening and sparsifying operator (SISSO) was introduced and has been successfully applied to a number of materials-science problems. SISSO is a compressed-sensing-based methodology that yields predictive models expressed in the form of analytical formulas built from simple physical properties. These formulas are systematically selected from an immense number (billions or more) of candidates. In this work, we describe a powerful extension of the methodology to a ‘multi-task learning’ approach, which identifies a single descriptor capturing multiple target materials properties at the same time. This approach is specifically suited to heterogeneous materials databases with scarce or partial data, e.g., those in which not all properties are reported for all materials in the training set. As showcase examples, we address the construction of materials-property maps for the relative stability of octet-binary compounds, considering several crystal phases simultaneously, and the metal/insulator classification of binary materials distributed over many crystal prototypes.
Highlights
The materials-genome initiative[1] inspired the establishment of several high-throughput computational materials-science projects, leading to the creation of worldwide accessible materials databases[2,3,4,5].
MT-SISSO for the relative stability of different structure pairs of AB binary materials: in Refs. [20,24,25], learning the relative stability between the rock-salt (RS) and zincblende (ZB) structures of AB octet binary compounds was used as a showcase study.
More information on these high-throughput density-functional theory (DFT) calculations can be found in Ref. [45], and all inputs and outputs are available in the Novel Materials Discovery (NOMAD) repository.
Summary
The materials-genome initiative[1] inspired the establishment of several high-throughput computational materials-science projects, leading to the creation of worldwide accessible materials databases[2,3,4,5]. Learning models that predict the relative stability of more than two crystal structures for the same set of chemical formulas can be cast as a multi-task learning (MTL) problem. A second setup where MT-SISSO is helpful is the learning of one common property of many materials that belong to physically different groups, e.g., materials with different bonding characteristics whose ground-state crystal structures belong to different space groups. In such a situation, a single predictive model is difficult to find.
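The multi-task idea described above can be illustrated with a small sketch. It is not the SISSO implementation: it assumes that all tasks share the same selected feature while each task gets its own fit coefficients, and that candidates are ranked by the root-mean-square of their per-task correlations. The function names `multitask_screen` and `fit_per_task`, the masks marking which materials have a reported value, and the toy data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def multitask_screen(features, targets, masks, n_keep=1):
    """Rank candidate features by a correlation score pooled over tasks.

    features: (n_samples, n_candidates) candidate descriptor values.
    targets:  list of (n_samples,) arrays, one per task (property).
    masks:    list of boolean (n_samples,) arrays, True where the
              property is reported for that material (partial data).
    """
    scores = np.zeros(features.shape[1])
    for y, m in zip(targets, masks):
        X = features[m] - features[m].mean(axis=0)
        r = y[m] - y[m].mean()
        # Absolute Pearson correlation of each candidate with this task,
        # computed only on the samples available for the task.
        denom = np.linalg.norm(X, axis=0) * np.linalg.norm(r) + 1e-12
        corr = np.abs(X.T @ r) / denom
        scores += corr ** 2
    # Assumed pooling rule: root-mean-square over tasks.
    scores = np.sqrt(scores / len(targets))
    return np.argsort(scores)[::-1][:n_keep]

def fit_per_task(features, targets, masks, selected):
    """Least-squares fit per task: shared features, separate coefficients."""
    coefs = []
    for y, m in zip(targets, masks):
        A = np.column_stack([features[m][:, selected], np.ones(m.sum())])
        c, *_ = np.linalg.lstsq(A, y[m], rcond=None)
        coefs.append(c)
    return coefs

# Toy data: two properties depend on the same feature (column 2)
# with different slopes and intercepts.
rng = np.random.default_rng(0)
n = 40
features = rng.normal(size=(n, 6))
y1 = 2.0 * features[:, 2] + 0.5
y2 = -1.0 * features[:, 2] + 3.0

# Partial data: each property is reported for a different subset.
mask1 = np.arange(n) < 25
mask2 = np.arange(n) >= 15

selected = multitask_screen(features, [y1, y2], [mask1, mask2], n_keep=1)
coefs = fit_per_task(features, [y1, y2], [mask1, mask2], selected)
```

In this toy setting, the shared screening step picks the one column that both properties depend on, even though neither property is known for every sample, and the per-task fits recover different slopes and intercepts on that common feature.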