Active Learning Training Strategy for Predicting O Adsorption Free Energy on Perovskite Catalysts using Inexpensive Catalyst Features

Shambhawi Shambhawi, Gábor Csányi, Alexei A Lapkin

doi:10.1002/cmtd.202100035


Open Access

https://doi.org/10.1002/cmtd.202100035


Abstract

Machine learning (ML) based energy prediction models are among the most effective descriptor‐based catalyst screening tools for heterogeneous reaction systems. However, their implementation is limited by expensive data labelling, ab initio feature evaluation and the lack of universal catalyst features, that is, features applicable beyond d‐band theory. Herein, we propose an inexpensive geometric feature for systems beyond d‐band theory, for example perovskites comprising s‐, p‐, d‐ and f‐block elements. We outline a workflow that inputs these features into an active learning algorithm that enables effective data labelling whilst improving on the prediction accuracies of existing models. We then use batch sampling to define termination criteria and to implement time‐series error forecasting, further reducing the number of expensive data labels needed for training. We implement this workflow to train ML models for predicting oxygen adsorption free energy on perovskites and achieve prediction accuracies similar to, if not better than, those obtained from ab initio features.

Highlights

A catalyst's performance for a given reaction system can be predicted using microkinetic models that are based on ab initio methods like density functional theory (DFT).[1]
We propose linear muffin-tin orbital theory (LMTO)[9] based geometric features for catalysts that can be extended to materials beyond intermetallics.
It was found that the mean root mean square error (RMSE) and mean absolute error (MAE) remain unchanged as the number of random test/train splits increases from 100 to 500.
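
The last highlight describes averaging errors over repeated random test/train splits. A minimal sketch of that check follows, using only the Python standard library. The one-feature linear model, the synthetic data, the 20 % test fraction and all function names are assumptions made for illustration; they are not the paper's models, features or data.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b (toy stand-in for an ML model)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def split_errors(xs, ys, test_fraction, rng):
    """RMSE and MAE on the held-out part of one random test/train split."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    n_test = max(1, int(len(idx) * test_fraction))
    test, train = idx[:n_test], idx[n_test:]
    a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
    residuals = [ys[i] - (a * xs[i] + b) for i in test]
    rmse = math.sqrt(statistics.fmean([r * r for r in residuals]))
    mae = statistics.fmean([abs(r) for r in residuals])
    return rmse, mae


def mean_errors(xs, ys, n_splits, test_fraction=0.2, seed=0):
    """Mean RMSE and mean MAE over n_splits random test/train splits."""
    rng = random.Random(seed)
    results = [split_errors(xs, ys, test_fraction, rng) for _ in range(n_splits)]
    mean_rmse = statistics.fmean([rmse for rmse, _ in results])
    mean_mae = statistics.fmean([mae for _, mae in results])
    return mean_rmse, mean_mae


# Synthetic data (hypothetical): a noisy linear relation with one feature.
data_rng = random.Random(1)
xs = [data_rng.uniform(0.0, 1.0) for _ in range(200)]
ys = [2.0 * x - 1.0 + data_rng.gauss(0.0, 0.1) for x in xs]

for n_splits in (100, 500):
    mean_rmse, mean_mae = mean_errors(xs, ys, n_splits)
    print(n_splits, round(mean_rmse, 4), round(mean_mae, 4))
```

If the two printed rows agree closely, the smaller number of splits is already enough to give a stable error estimate, which is the kind of convergence the highlight reports.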

Summary

Introduction

Computing a descriptor for more than a thousand catalysts, including different facets and adsorption sites, still requires significant computational time and resources. For the active learning technique, we implement a committee-based query strategy, that is, maximum disagreement,[15] and the expected error reduction strategy.[16] We outline a workflow that uses time-series forecasting to predict the RMSE of unlabeled data based on past iterations. This workflow can be implemented to screen catalysts from search spaces with thousands of possible candidates and can be extended to other reaction chemistries.
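
The committee-based maximum-disagreement query named above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the committee is built from bootstrap resamples of the labelled data, each member is a one-feature linear fit, disagreement is measured as the variance of the members' predictions, and every function name is hypothetical. The expected error reduction strategy is not covered here.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b (toy committee member)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate resample (all x equal): fall back to a constant model.
        return 0.0, my
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def train_committee(xs, ys, n_members, rng):
    """Train each member on a bootstrap resample of the labelled data."""
    committee = []
    for _ in range(n_members):
        sample = [rng.randrange(len(xs)) for _ in xs]
        committee.append(fit_line([xs[i] for i in sample],
                                  [ys[i] for i in sample]))
    return committee


def disagreement(committee, x):
    """Variance of the committee's predictions at candidate x."""
    predictions = [a * x + b for a, b in committee]
    return statistics.pvariance(predictions)


def query_max_disagreement(committee, pool, batch_size):
    """Indices of the batch_size pool candidates the committee disagrees on most."""
    ranked = sorted(range(len(pool)),
                    key=lambda i: disagreement(committee, pool[i]),
                    reverse=True)
    return ranked[:batch_size]


# Hypothetical labelled set and unlabelled candidate pool.
rng = random.Random(0)
labelled_x = [rng.uniform(0.0, 1.0) for _ in range(15)]
labelled_y = [2.0 * x - 1.0 + rng.gauss(0.0, 0.1) for x in labelled_x]
pool = [i / 20 for i in range(-20, 41)]  # candidates from -1.0 to 2.0

committee = train_committee(labelled_x, labelled_y, n_members=10, rng=rng)
batch = query_max_disagreement(committee, pool, batch_size=3)
print([pool[i] for i in batch])
```

In an active learning loop, the selected candidates would be labelled (for example, by an expensive calculation), added to the training set, and the committee retrained before the next query.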

Results and Discussion

O Adsorption Free Energy Prediction Models

Active Learning Based Prediction Models

Batch RMSE Forecasting
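
The Introduction describes forecasting the RMSE of unlabeled data from past iterations and using batch sampling to define termination criteria. The sketch below illustrates that idea only in outline. A simple linear-trend extrapolation over the most recent batches stands in for the forecasting model, whose exact form is not given in this summary; the window size, the tolerance and the function names are hypothetical.

```python
def forecast_next_rmse(history, window=4):
    """Extrapolate the next batch RMSE from a linear fit to the last `window` values."""
    recent = history[-window:]
    n = len(recent)
    if n < 2:
        raise ValueError("need at least two past RMSE values to fit a trend")
    steps = list(range(n))
    mean_t = sum(steps) / n
    mean_y = sum(recent) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(steps, recent))
             / sum((t - mean_t) ** 2 for t in steps))
    intercept = mean_y - slope * mean_t
    # An RMSE cannot be negative, so clip the extrapolated value at zero.
    return max(0.0, intercept + slope * n)


def should_stop(history, tolerance, window=4):
    """Stop labelling when the forecast improvement over the last RMSE is below tolerance."""
    if len(history) < window:
        return False  # not enough iterations yet to judge the trend
    expected_gain = history[-1] - forecast_next_rmse(history, window)
    return expected_gain < tolerance


# Hypothetical batch RMSE values from successive active learning iterations.
improving = [1.0, 0.8, 0.6, 0.4]
plateaued = [0.5, 0.31, 0.30, 0.30, 0.30]

print(should_stop(improving, tolerance=0.05))
print(should_stop(plateaued, tolerance=0.05))
```

With this rule, a run whose forecast error is flat or rising is also terminated, since the expected gain is then zero or negative. A stricter criterion could require the condition to hold for several consecutive batches.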

Conclusions

Conflict of Interest