Abstract

Subgroup discovery (SGD) is presented here as a data-mining approach for finding interpretable local patterns, correlations, and descriptors of a target property in materials-science data. Specifically, we consider data generated by density-functional theory calculations. First, we demonstrate that SGD can identify physically meaningful models that classify the crystal structures of 82 octet binary (OB) semiconductors as either rocksalt or zincblende. SGD identifies an interpretable two-dimensional model, derived from only the atomic radii of valence s and p orbitals, that correctly classifies the crystal structures of 79 of the 82 OB semiconductors. The SGD framework is subsequently applied to 24 400 configurations of neutral gas-phase gold clusters with 5–14 atoms to discern general patterns between geometrical and physicochemical properties. For example, SGD helps find that van der Waals interactions within gold clusters are linearly correlated with their radius of gyration and are weaker for planar clusters than for nonplanar clusters. In addition, a descriptor that predicts a local linear correlation between the chemical hardness and the cluster isomer stability is found for the even-sized gold clusters.

Highlights

  • Rational design of advanced functional materials, e.g., active and selective catalysts [1], efficient thermoelectrics [2], and high-temperature superconductors [3], requires an understanding of the underlying fundamental physical mechanisms.

  • The application of big-data analytics to obtain material insights and to predict novel materials can be enhanced by the availability of large materials repositories, e.g., AFLOWLIB, Computational Materials Repository, Electronic Structure Project, Materials Project, Novel Materials Discovery (NOMAD), Open Quantum Materials Database (OQMD), and Pauling file [26].

  • Our objective is to develop and exploit big-data analytics tools to discover materials insights and to predict advanced materials from large collections of materials data stored within the NOMAD Archive [27].


Summary

Introduction

Rational design of advanced functional materials, e.g., active and selective catalysts [1], efficient thermoelectrics [2], and high-temperature superconductors [3], requires an understanding of the underlying fundamental physical mechanisms. In this paper we demonstrate a multipurpose data-mining algorithm called subgroup discovery (SGD) that identifies and describes local patterns, correlations, and descriptors in materials-science data with respect to a desired target property (or properties) [29,30,31,32]. We demonstrate that SGD can identify physically meaningful models that classify the crystal structures of 82 octet binary semiconductors as either rocksalt (RS) or zincblende (ZB) using only information about their chemical composition. The aim of investigating gold clusters here is twofold: (1) to search for general structure-property relationships that hold across gold clusters of different sizes and vastly different configurations; and (2) to demonstrate the versatility of subgroup discovery on a large and heterogeneous dataset. We show that SGD can help identify unexpected, general, size-independent patterns within the dataset of gold cluster configurations.
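
To make the idea of describing a "local pattern" concrete, the following is a minimal sketch of subgroup discovery for a binary target. It is not the authors' implementation. It assumes weighted relative accuracy (WRAcc), a common quality function in the subgroup-discovery literature, and a brute-force search over conjunctions of threshold propositions. The toy records, the feature names `a` and `b`, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Toy records (feature_a, feature_b, label); values are invented for illustration.
DATA = [
    (0.2, 1.0, 1), (0.3, 1.2, 1), (0.4, 0.9, 1), (0.5, 2.0, 0),
    (1.1, 1.1, 0), (1.3, 0.8, 0), (1.5, 2.2, 0), (0.3, 2.5, 0),
]
FEATURES = ("a", "b")  # hypothetical feature names


def make_selectors(data):
    """Build basic propositions 'feature <= t' and 'feature > t'."""
    selectors = []
    for idx, name in enumerate(FEATURES):
        values = sorted({row[idx] for row in data})
        for t in values[:-1]:  # skip the maximum: '<= max' covers every row
            selectors.append((name, idx, "<=", t))
            selectors.append((name, idx, ">", t))
    return selectors


def covers(selector, row):
    """Return True if the row satisfies a single proposition."""
    _, idx, op, t = selector
    return row[idx] <= t if op == "<=" else row[idx] > t


def wracc(description, data):
    """Weighted relative accuracy: coverage * (subgroup rate - global rate)."""
    covered = [row for row in data if all(covers(s, row) for s in description)]
    if not covered:
        return 0.0
    coverage = len(covered) / len(data)
    p_sub = sum(row[2] for row in covered) / len(covered)
    p_all = sum(row[2] for row in data) / len(data)
    return coverage * (p_sub - p_all)


def best_subgroup(data, max_depth=2):
    """Exhaustively score all conjunctions of up to max_depth propositions."""
    selectors = make_selectors(data)
    best = ((), 0.0)
    for depth in range(1, max_depth + 1):
        for description in combinations(selectors, depth):
            quality = wracc(description, data)
            if quality > best[1]:
                best = (description, quality)
    return best


if __name__ == "__main__":
    description, quality = best_subgroup(DATA)
    print(" AND ".join(f"{n} {op} {t}" for n, _, op, t in description), quality)
```

The quality function trades off how many records a description covers against how strongly the target deviates inside the subgroup from its global behavior; this trade-off is what makes the resulting patterns local yet statistically relevant. Several descriptions can tie in quality on such a small dataset, and brute-force enumeration is feasible only for toy inputs; realistic datasets call for a more efficient search strategy with pruning.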

Subgroup Discovery
Subgroup Quality
Search Strategy
Opportunistic pruning
Octet Binary Semiconductors
Neutral Gas-Phase Gold Clusters
Finding patterns of the HOMO-LUMO energy gap
Analyzing relationships between chemical hardness and cluster stability
Findings
Conclusions