Abstract
Sequential learning (SL) strategies, i.e., iteratively updating a machine learning model to guide experiments, have been proposed to significantly accelerate materials discovery and research. Applications on computational datasets and a handful of optimization experiments have demonstrated the promise of SL, motivating a quantitative evaluation of its ability to accelerate materials discovery, specifically in the case of physical experiments. The benchmarking effort in the present work quantifies the performance of SL algorithms with respect to a breadth of research goals: discovery of any “good” material, discovery of all “good” materials, and discovery of a model that accurately predicts the performance of new materials. To benchmark the effectiveness of different machine learning models against these goals, we use datasets in which the performance of all materials in the search space is known from high-throughput synthesis and electrochemistry experiments. Each dataset contains all pseudo-quaternary metal oxide combinations from a set of six elements (the chemical space), and the performance metric is the electrocatalytic activity (overpotential) for the oxygen evolution reaction (OER). A diverse set of SL schemes is tested on four chemical spaces, each containing 2121 catalysts. The present work suggests that research can be accelerated by up to a factor of 20 compared to random acquisition in specific scenarios. The results also show that certain choices of SL models are ill-suited for a given research goal, resulting in substantial deceleration compared to random acquisition methods. The results provide quantitative guidance on how to tune an SL strategy for a given research goal and demonstrate the need for a new generation of materials-aware SL algorithms to further accelerate materials discovery.
Highlights
Accelerating materials discovery is of utmost importance for the realization of several emergent technologies and for combating climate change through the adoption of zero- or negative-emission technologies, such as hydrogen-powered vehicles and other means of clean chemical energy generation, storage, and utilization.
The compendium of simulated learning results indicates that (i) while exploration by uncertainty-based sample selection can accelerate the establishment of predictive models in niche situations where a substantial fraction of the search space is measured, random experiment selection is typically a suitable strategy; (ii) enhancement factors (EF) and acceleration factors (AF) up to approximately 20× are possible for identifying any or all top catalysts, demonstrating a ceiling for the extent by which sequential learning can improve catalyst discovery (see the sketch following these highlights); (iii) EF and AF values well below 0.05 are observed, indicating that the floor for deleterious effects of sequential learning is relatively deep compared to the ceiling.
Poor choices of machine learning (ML) model and/or acquisition function for a given experiment budget or research objective can lead to substantially worse performance than random sample selection, a critical lesson that illustrates the importance of comprehensive workflow design in the context of specific research objectives.[2]
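This excerpt does not reproduce the formal definitions of EF and AF, so the following is a minimal sketch of one plausible reading of the acceleration factor, assuming it compares the number of experiments required by random acquisition and by SL to reach the same research goal; the symbols n_rand and n_SL are illustrative and not taken from the paper:

\[ \mathrm{AF} = \frac{n_{\mathrm{rand}}}{n_{\mathrm{SL}}} \]

Under this reading, an AF of approximately 20 means SL reaches the goal with roughly one-twentieth of the experiments random acquisition would require, while an AF below 0.05 means SL requires more than twenty times as many experiments as random acquisition.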
Summary
Sequential learning, in which a model guides experiment selection at each iteration based on previously acquired data, is a promising approach to accelerate materials research. The SL framework is designed to enable facile variation of both the machine learning model and the acquisition function, and it is implemented under the assumption of a discretized search space that represents all possible experiments, which we refer to as the sample set of size N. SL cycle i results in the measurement of the figure of merit (FOM) for a newly selected point in the search space, thereby increasing the size of the training set to i + 1 samples. This SL technique can be implemented with any machine learning model that provides a predicted FOM value and an uncertainty for that prediction at each input coordinate. Variation across repeated SL runs is visualized by plotting the median value as well as shaded regions representing the 6th to 94th percentile, i.e., removing the top 2 and bottom 2 values from each set of 50 values of x_ALM,i.
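The summary above describes the SL cycle only in prose, so the following is a minimal Python sketch of that loop under stated assumptions: a discretized sample set of size N whose FOM values are all already measured (as in the benchmark datasets), a surrogate model that returns a prediction and an uncertainty for each candidate (a Gaussian process regressor is used purely as a stand-in), and a simple acquisition function. Names such as run_sl_campaign and acquire are illustrative, not taken from the paper.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def acquire(pred, std, strategy="exploit"):
    """Pick the index of the next experiment among unmeasured candidates.

    'exploit' targets the best predicted FOM (lowest overpotential);
    'explore' targets the most uncertain prediction.
    """
    return int(np.argmin(pred)) if strategy == "exploit" else int(np.argmax(std))

def run_sl_campaign(X, fom, n_init=5, budget=50, strategy="exploit", seed=0):
    """Simulated SL campaign on a fully measured search space of size N.

    X   : (N, d) array of composition coordinates
    fom : (N,) array of measured figures of merit (e.g., OER overpotential)
    """
    rng = np.random.default_rng(seed)
    N = len(fom)
    measured = list(rng.choice(N, size=n_init, replace=False))  # initial random samples

    for _ in range(budget):
        unmeasured = np.setdiff1d(np.arange(N), measured)
        model = GaussianProcessRegressor(normalize_y=True)       # stand-in surrogate model
        model.fit(X[measured], fom[measured])                     # retrain on the i samples so far
        pred, std = model.predict(X[unmeasured], return_std=True)
        nxt = unmeasured[acquire(pred, std, strategy)]            # acquisition selects sample i + 1
        measured.append(int(nxt))                                 # "measure" it by looking up its FOM

    return measured  # acquisition order; compare against random selection to score EF/AF

# Usage sketch on synthetic data standing in for a 2121-catalyst chemical space:
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((2121, 4))                 # pseudo-quaternary composition coordinates
    fom = rng.normal(0.4, 0.05, size=2121)    # synthetic overpotentials (V), for illustration only
    order = run_sl_campaign(X, fom, strategy="exploit")
    print("best overpotential found:", fom[order].min())
```

Repeating such a campaign many times (e.g., the 50 runs mentioned above) with different random initializations is what produces the median and percentile bands described in the summary.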