Abstract

For high-throughput screening of materials for heterogeneous catalysis, scaling relations provide an efficient scheme to estimate the chemisorption energies of hydrogenated species. However, conditioning on a single descriptor ignores the model uncertainty and leads to suboptimal prediction of the chemisorption energy. In this article, we extend the single-descriptor linear scaling relation to multi-descriptor linear regression models that leverage the correlation between the adsorption energies of any pair of adsorbates. With a large dataset, we use the Bayesian Information Criterion (BIC) as the model evidence to select the best linear regression model. Furthermore, Gaussian Process Regression (GPR) based on a meaningful convolution of physical properties of the metal-adsorbate complex can be used to predict the baseline residual of the selected model. This integrated Bayesian model selection and Gaussian process regression, dubbed residual learning, can achieve performance comparable to the standard DFT error (0.1 eV) for most adsorbate systems. For sparse and small datasets, we propose an ad hoc Bayesian Model Averaging (BMA) approach to make robust predictions. With this Bayesian framework, we significantly reduce the model uncertainty and improve the prediction accuracy. The possibilities of the framework for high-throughput catalytic materials exploration in a realistic setting are illustrated using large and small sets of both dense and sparse simulated datasets generated from a public database of bimetallic alloys available on Catalysis-Hub.org.

Highlights

  • Mean-field microkinetic models, developed by combining electronic structure properties with macroscopic reaction parameters such as reaction temperature and pressure[1,2], are used to obtain fundamental insights into the reaction kinetics occurring at solid/gas interfaces

  • To tackle the evidence problem, we propose the Bayesian information criterion (BIC) as the model evidence to select the model that best balances the bias-variance trade-off[29,30]; the approach is described in detail later in this manuscript

  • To address the prediction problem in small datasets, we propose Bayesian model averaging (BMA) as a robust solution: instead of choosing a single linear regression model, we use a small set of the best models to obtain a better prediction[28,30,31]
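The two ideas in the highlights above can be sketched together. The sketch below is illustrative only and is not the authors' implementation. It assumes the common Gaussian-error form BIC = n ln(RSS/n) + k ln(n) and the standard approximation that model weights are proportional to exp(-BIC/2). The descriptor matrix, the noise level, and the function names (`fit_ols`, `rank_models`, `bma_predict`) are all hypothetical, and the data are synthetic.

```python
import itertools
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns coefficients and RSS."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ coef) ** 2))
    return coef, rss


def bic(rss, n, k):
    """BIC for a Gaussian linear model, up to an additive constant (assumed form)."""
    return n * np.log(rss / n) + k * np.log(n)


def rank_models(X, y, max_size=2):
    """Fit every descriptor subset up to max_size and sort by BIC (lowest first)."""
    n, p = X.shape
    models = []
    for size in range(1, max_size + 1):
        for subset in itertools.combinations(range(p), size):
            coef, rss = fit_ols(X[:, list(subset)], y)
            models.append((bic(rss, n, size + 1), subset, coef))
    models.sort(key=lambda m: m[0])
    return models


def bma_predict(models, x_new, top=3):
    """Average the predictions of the `top` models with exp(-BIC/2) weights."""
    best = models[:top]
    bics = np.array([m[0] for m in best])
    w = np.exp(-0.5 * (bics - bics.min()))  # subtract the minimum for stability
    w /= w.sum()
    preds = np.array([m[2][0] + x_new[list(m[1])] @ m[2][1:] for m in best])
    return float(w @ preds), w


# Synthetic data (not from the paper): three hypothetical descriptors,
# of which only the first two actually drive the target energy.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = 0.4 * X[:, 0] + 0.3 * X[:, 1] - 0.5 + rng.normal(scale=0.05, size=30)

models = rank_models(X, y)
pred, weights = bma_predict(models, np.array([1.0, -1.0, 0.5]))
print("best subset:", models[0][1], "BMA prediction:", round(pred, 3))
```

With a large dataset, one would keep only `models[0]`, the BIC-selected model. With a small or sparse dataset, the weighted average spreads the prediction over several plausible models instead of committing to one.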

Summary

Introduction

Mean-field microkinetic models, developed by combining electronic structure properties with macroscopic reaction parameters such as reaction temperature and pressure[1,2], are used to obtain fundamental insights into the reaction kinetics occurring at solid/gas interfaces. A popular approach to high-throughput computational discovery of heterogeneous catalytic materials is the descriptor-based approach[9,10], where suitable descriptors, e.g., the d-band center and width, are chosen to efficiently compute the chemisorption energy of all the reaction intermediates without performing a full DFT computation. To this end, Nørskov and Hammer proposed a simplified theory for adsorbate bonding on transition metal surfaces based on the electronic interaction of the adsorbate sp-band with the metal d-band[11,12,13]. One of the major breakthroughs in computational catalysis and surface science research came when Abild-Pedersen and co-workers identified a linear scaling relation that determines the adsorption energy from only the adsorbate valency (sp-band of the adsorbate) together with metallic d-band properties[3]. Owing to these underlying mechanisms, linear scaling relationships are found between the adsorption energies of similar species. For any molecular fragment AHx, the adsorption energy is generally linearly correlated with the adsorption energy of A, which can be expressed mathematically as

$$\Delta E_{AH_x} = \gamma \, \Delta E_{A} + \xi \qquad (1)$$
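As a minimal illustration of Eq. (1), the sketch below fits the slope γ and intercept ξ by ordinary least squares and then uses them to estimate the adsorption energy of AHx from that of A. The energies are made-up values in eV, chosen only for demonstration, and the function names are hypothetical. Nothing here is taken from the paper's dataset.

```python
def fit_scaling_relation(e_a, e_ahx):
    """Ordinary least-squares fit of E_AHx = gamma * E_A + xi."""
    n = len(e_a)
    mean_a = sum(e_a) / n
    mean_ahx = sum(e_ahx) / n
    sxx = sum((a - mean_a) ** 2 for a in e_a)
    sxy = sum((a - mean_a) * (b - mean_ahx) for a, b in zip(e_a, e_ahx))
    gamma = sxy / sxx
    xi = mean_ahx - gamma * mean_a
    return gamma, xi


def predict_ahx(e_a, gamma, xi):
    """Estimate the AHx adsorption energy from the adsorption energy of A."""
    return gamma * e_a + xi


# Illustrative (made-up) adsorption energies in eV on five surfaces.
e_a = [-7.0, -6.5, -6.0, -5.5, -5.0]
noise = [0.05, -0.03, 0.0, 0.03, -0.05]
e_ahx = [0.5 * a - 0.8 + d for a, d in zip(e_a, noise)]

gamma, xi = fit_scaling_relation(e_a, e_ahx)
print(f"gamma = {gamma:.3f}, xi = {xi:.3f} eV")
print(f"predicted E_AHx at E_A = -6.2 eV: {predict_ahx(-6.2, gamma, xi):.3f} eV")
```

A single fitted line of this kind is the single-descriptor baseline that the multi-descriptor regression models described in the abstract extend.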
