Abstract

The identification of descriptors of materials properties and functions that capture the underlying physical mechanisms is a critical goal in data-driven materials science. Only such descriptors will enable trustworthy and efficient scanning of materials spaces and possibly the discovery of new materials. Recently, the sure-independence screening and sparsifying operator (SISSO) has been introduced and successfully applied to a number of materials-science problems. SISSO is a compressed-sensing-based methodology that yields predictive models expressed in the form of analytical formulas, built from simple physical properties. These formulas are systematically selected from an immense number (billions or more) of candidates. In this work, we describe a powerful extension of the methodology to a ‘multi-task learning’ approach, which identifies a single descriptor capturing multiple target materials properties at the same time. This approach is specifically suited for a heterogeneous materials database with scarce or partial data, e.g., one in which not all properties are reported for all materials in the training set. As showcase examples, we address the construction of materials-properties maps for the relative stability of octet-binary compounds, considering several crystal phases simultaneously, and the metal/insulator classification of binary materials distributed over many crystal prototypes.

Highlights

  • The materials-genome initiative[1] inspired the establishment of several high-throughput computational materials-science projects, leading to the creation of worldwide accessible materials databases[2,3,4,5]

  • MT-SISSO for the relative stability of different structure pairs of AB binary materials: in Refs. [20,24,25], the learning of the relative stability between the rock-salt (RS) and zincblende (ZB) structures of AB octet binary compounds was used as a showcase study

  • More information on these high-throughput density-functional theory (DFT) calculations can be found in Ref. 45, and all inputs and outputs are available in the Novel Materials Discovery (NOMAD) repository


Summary

INTRODUCTION

The materials-genome initiative[1] inspired the establishment of several high-throughput computational materials-science projects, leading to the creation of worldwide accessible materials databases[2,3,4,5]. Learning models that predict the relative stability of more than two crystal structures, given the same set of chemical formulas, can be cast as a multi-task learning (MTL) problem. A second setup in which multi-task SISSO (MT-SISSO) is helpful is the learning of one common property of many materials that belong to physically different groups, e.g., materials with different bonding characteristics whose ground-state crystal structures belong to different space groups. In such a situation, a single predictive model is difficult to find.
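The multi-task idea of one descriptor shared by several properties, learned from partially reported data, can be illustrated with a toy sketch. This is not the SISSO algorithm itself: it assumes a one-dimensional descriptor chosen by brute force from a handful of candidates, task-specific linear coefficients, and a summed squared error as the selection criterion. The function name `select_shared_feature` and the synthetic data are hypothetical and serve only as illustration.

```python
import numpy as np

def select_shared_feature(features, targets):
    """Pick the one candidate feature that best fits all tasks together.

    features: (n_materials, n_candidates) array of candidate descriptor values.
    targets:  (n_materials, n_tasks) array; np.nan marks an unreported property.
    Returns the index of the selected candidate and an (n_tasks, 2) array of
    per-task (slope, intercept) coefficients.
    """
    n_candidates = features.shape[1]
    n_tasks = targets.shape[1]
    best_index, best_error, best_coeffs = -1, np.inf, None
    for j in range(n_candidates):
        total_error = 0.0
        coeffs = []
        for t in range(n_tasks):
            # Use only the materials for which property t is reported.
            mask = ~np.isnan(targets[:, t])
            design = np.column_stack([features[mask, j], np.ones(mask.sum())])
            solution, *_ = np.linalg.lstsq(design, targets[mask, t], rcond=None)
            residual = targets[mask, t] - design @ solution
            total_error += float(residual @ residual)
            coeffs.append(solution)
        # The descriptor is shared; only the coefficients differ per task.
        if total_error < best_error:
            best_index, best_error, best_coeffs = j, total_error, np.array(coeffs)
    return best_index, best_coeffs

# Synthetic example (assumed data): both properties depend on candidate 2,
# with different slopes and intercepts, and each is only partially reported.
rng = np.random.default_rng(0)
features = rng.normal(size=(30, 5))
targets = np.column_stack([
    2.0 * features[:, 2] + 1.0,
    -3.0 * features[:, 2] + 0.5,
])
targets[:10, 0] = np.nan  # property 0 missing for the first 10 materials
targets[20:, 1] = np.nan  # property 1 missing for the last 10 materials

index, coeffs = select_shared_feature(features, targets)
```

Because every task contributes to the selection error using only the materials for which it is reported, a material with a single known property still helps determine the shared descriptor.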

Single-task SISSO for continuous property
Multi-task SISSO for learning continuous properties
MT-SISSO for categorical properties
Computational complexity of SISSO
RESULTS AND DISCUSSION
Dimension of the descriptor
CONCLUSIONS