Learning What Makes Catalysts Good

Nongnuch Artrith

doi:10.1016/j.matt.2020.09.012

Abstract

Machine learning has proven to be a powerful tool for accelerating the computational characterization of energy materials. The neural-network approach by Lu et al. allows bypassing time-consuming first-principles calculations for the design of catalysts based on high-entropy alloys. This work is an example of a growing number of case studies identifying descriptors of catalytic performance using machine learning instead of physical intuition.

Heterogeneous catalysis is at the core of many essential processes in the chemical industry, such as the production of ammonia for agriculture (Haber-Bosch process), methanol, and other commodity chemicals. Because preparing and experimentally testing catalysts under operating conditions is expensive and time-consuming, computational approaches have long been developed that can (often) quantitatively predict reaction pathways, and thereby catalytic activities and selectivities, from first principles [1: Nørskov J.K., Bligaard T., Rossmeisl J., Christensen C.H. Towards the computational design of solid catalysts. Nat. Chem. 2009; 1: 37-46]. Materials classes that allow for systematic tuning, such as high-entropy alloys (HEAs) [2: Lu Z., Chen Z.W., Singh C.V. Neural Network-Assisted Development of High-Entropy Alloy Catalysts: Decoupling Ligand and Coordination Effects. Matter. 2020; 3 (this issue): 1318-1333; 3: Batchelor T.A.A., Pedersen J.K., Winther S.H., Castelli I.E., Jacobsen K.W., Rossmeisl J. High-Entropy Alloys as a Discovery Platform for Electrocatalysis. Joule. 2019; 3: 834-845], offer in principle a rich opportunity space for in silico catalyst discovery by first-principles computational screening.

However, heterogeneous catalysis is a deceptively simple problem, and an exhaustive search based on first-principles methods that considers the true complexity of catalytic processes is infeasible in practice. Catalytic activity, selectivity, and the stability of catalysts depend not only on the composition of the catalyst material but also on its atomic structure, particle size and morphology, and defects such as vacancies, off-stoichiometries, and step edges. In combination with the high-dimensional composition spaces of, for example, HEAs, performing first-principles calculations of all possibilities becomes very challenging, owing to the high computational cost of such calculations. To avoid this curse of dimensionality, the entire complexity of the catalyst material is usually not modeled. Instead, much research effort has been dedicated to identifying descriptors that correlate with the catalytic performance and can be obtained more easily (Figure 1).

Figure 1. Machine Learning (ML) Can Identify Correlations beyond Known Physics and Intuition
Physics/intuition-based descriptors of the binding energy (BE), such as the Newns-Anderson model (d-band model) or scaling relations, are commonly used to accelerate computational catalyst discovery. ML models can detect more general correlations between the structural and chemical features of a catalyst and its catalytic properties, providing powerful new descriptors for materials discovery.
The interpretation of non-linear ML models remains an important research question.

Sabatier’s principle tells us that the binding energy of reaction intermediates on the catalyst surface must be neither too strong nor too weak for optimal activity. To avoid calculating all molecular binding energies from first principles, descriptors are often considered. The Newns-Anderson model [4: Newns D.M. Self-Consistent Model of Hydrogen Chemisorption. Phys. Rev. 1969; 178: 1123-1135] relates the binding energy on transition metal surfaces to the energy of the electronic d-band states relative to the Fermi level and thus provides a physics-based electronic-structure feature that correlates with the binding energy (d-band model). Following chemical intuition, it is also found that the binding energies of single oxygen (O∗) and carbon (C∗) atoms are proportional to the binding energies of molecular species that coordinate to the catalyst surface through O or C atoms, providing simple scaling relations for estimating binding energies [5: Jones G., Bligaard T., Abild-Pedersen F., Nørskov J.K. Using scaling relations to understand trends in the catalytic activity of transition metals. J. Phys. Condens. Matter. 2008; 20: 064239]. The d-band center and the O∗/C∗ binding energies are useful features for the prediction of binding energies, and both were identified by considering known physical interactions and chemical intuition.

Recent work indicates that more general and more accurate descriptors can be found when machine learning (ML) is employed as a substitute for human intuition [6: Artrith N. Machine Learning for the Modeling of Interfaces in Energy Storage and Conversion Materials. J. Phys. Energy. 2019; 1: 032002; 7: Li Z., Ma X., Xin H. Feature Engineering of Machine-Learning Chemisorption Models for Catalyst Design. Catal. Today. 2017; 280: 232-238; 8: Ulissi Z.W., Tang M.T., Xiao J., Liu X., Torelli D.A., Karamad M., Cummins K., Hahn C., Lewis N.S., Jaramillo T.F., et al. Machine-Learning Methods Enable Exhaustive Searches for Active Bimetallic Facets and Reveal Active Site Motifs for CO2 Reduction. ACS Catal. 2017; 7: 6600-6608; 9: Kolsbjerg E.L., Peterson A.A., Hammer B. Neural-Network-Enhanced Evolutionary Algorithm Applied to Supported Metal Nanoparticles. Phys. Rev. B. 2018; 97: 195424; 10: Artrith N., Lin Z., Chen J.G. Predicting the Activity and Selectivity of Bimetallic Metal Catalysts for Ethanol Reforming Using Machine Learning. ACS Catal. 2020; 10: 9438-9444].

Lu et al. [2] trained an artificial neural network (NN) model on first-principles data for the prediction of OH∗ binding energies on IrPdPtRhRu HEAs, reaching an accuracy that is close to the intrinsic error of first-principles calculations (∼0.1 eV).
The model inputs are features related to the geometry of the adsorption site (coordination numbers) and the local chemical composition (element group and period as well as atomic radii). Unlike the conventional scaling relations, the NN model by Lu et al. yields quantitative predictions for multiple crystal facets, including step edges and corner sites, and it captures the variation of the binding energy with changes in the composition and ordering of surface atoms. The NN model is orders of magnitude more computationally efficient than first-principles calculations and facilitated the comparison of a large number of HEA structures that would not have been possible otherwise.

This work is one example of a growing number of studies that demonstrate how ML may reveal previously unknown non-linear correlations, allowing for the prediction of binding energies [3, 7, 8] of reaction intermediates and even transition state energies [9, 10] of individual reaction steps. The models in the literature differ in the ML methods and in the choice of input features, and further improvement in both accuracy and transferability can be expected as the scientific community explores further variations over the coming years. In addition to constructing models of properties relevant for catalysis, ML has also been used to accelerate first-principles calculations, facilitating the simulation of non-idealized catalyst surfaces, nanoparticles, and interfaces [6].

ML also comes with its own challenges. One challenge is the construction and validation of ML models, which generally involves:
(1) compiling a high-quality training dataset;
(2) optimizing hyperparameters using an independent validation dataset;
(3) quantifying prediction errors using independent testing data; and
(4) ensuring that the test set is representative of the intended application.

A prerequisite for the construction of ML models is (1) the availability of accurate and comprehensive datasets. The dataset generated by Lu et al. consisted, for example, of ∼1,400 first-principles calculations of different alloy surfaces [2]. If the ML model depends on hyperparameters such as the NN architecture, they are ideally optimized for (2) a validation dataset that is independent of the training set.
An ML model is only as good as the data that it was trained on, and careful validation on (3) independent test data is therefore of utmost importance. This test set must be (4) representative of the intended application in order to provide a meaningful error estimate. For catalysis applications, model validation ideally involves comparison with experiment [2, 10].

In the absence of human intuition, another challenge of ML models pertains to their interpretation. Complex non-linear models often turn out to be black boxes that yield the desired answer (e.g., the binding energy) but offer limited insight into their inner workings. Lu et al. addressed this question by constructing a second, linear model that is transparent to interpretation [2]. However, the linear model is less accurate than the non-linear NN model, and it remains an interesting research question how to design interpretable yet accurate non-linear ML models for catalysis applications that can truly replace (or extend) the intuitive insight that conventional approaches such as the Newns-Anderson model or the scaling relations offer.
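The four-step validation workflow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic (a made-up function of a coordination number and a mean neighbor group, not first-principles results), a k-nearest-neighbor regressor stands in for the neural network to keep the example dependency-free, and all function names are hypothetical. It is not the procedure used by Lu et al.

```python
import random

def make_toy_sites(n, seed=0):
    """Synthetic stand-in for a first-principles dataset (NOT real data).

    Each sample is ((coordination_number, mean_neighbor_group), energy_eV).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        cn = rng.randint(3, 9)           # coordination number of the site
        group = rng.uniform(8.0, 10.0)   # mean periodic-table group of neighbors
        # Made-up target with a non-linear term; no physical meaning intended.
        energy = -0.1 * cn + 0.02 * (cn - 6) ** 2 + 0.05 * (group - 9.0)
        samples.append(((cn, group), energy + rng.gauss(0.0, 0.02)))
    return samples

def split(samples, seed=0):
    """Shuffle once, then cut into disjoint 70/15/15 train/validation/test sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_train = (7 * len(shuffled)) // 10
    n_val = (15 * len(shuffled)) // 100
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

def knn_predict(train, x, k):
    """Average the energies of the k training sites closest to x."""
    by_distance = sorted(
        train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
    nearest = by_distance[:k]
    return sum(energy for _, energy in nearest) / len(nearest)

def mean_abs_error(train, data, k):
    """Mean absolute error of the k-NN model (fit on train) evaluated on data."""
    return sum(abs(knn_predict(train, x, k) - y) for x, y in data) / len(data)

def run_workflow(n=200, candidate_ks=(1, 3, 5, 9)):
    # Step (1): compile the dataset and hold out validation and test sets.
    train, val, test = split(make_toy_sites(n))
    # Step (2): choose the hyperparameter k using the validation set only.
    best_k = min(candidate_ks, key=lambda k: mean_abs_error(train, val, k))
    # Step (3): report the error on test data never used for fitting or tuning.
    test_mae = mean_abs_error(train, test, best_k)
    # Step (4) is not automated here: the test set comes from the same toy
    # distribution as the training set, which a real application must verify.
    return {"sizes": (len(train), len(val), len(test)),
            "best_k": best_k, "test_mae": test_mae}

if __name__ == "__main__":
    print(run_workflow())
```

Because the hyperparameter is selected on the validation set alone, the test error remains an independent estimate. Whether that estimate is meaningful still depends on step (4), which in practice requires checking that the test sites cover the facets and compositions of the intended application.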
