Machine-learning models for predicting adsorption energies on metallic surfaces often rely on basic elemental properties together with electronic and geometric descriptors. Here, we apply categorical entity embedding, a featurization method inspired by natural language processing techniques, to predict adsorption energies on bimetallic alloy surfaces. Using this method, we learn a representation from categorical descriptors of the slab/adsorbate complex (e.g., surface composition, adsorbate type, and site type). Combining this learned representation with numerical features (e.g., slab metal stoichiometric ratios) yields the CatEmbed representation. Remarkably, decision tree models trained on CatEmbed, which includes no explicit geometric information, achieve a mean absolute error (MAE) of 0.12 eV. We also extend this technique to predict reaction energies on bimetallic surfaces, creating the CatEmbed-React representation, which achieves an MAE of 0.08 eV. These findings highlight the effectiveness of categorical entity embedding for predicting adsorption and reaction energies on bimetallic alloy surfaces.
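The featurization described above, in which each categorical descriptor is mapped to a dense vector and the result is concatenated with numerical features, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the vocabularies, the embedding width `EMBED_DIM`, the helper names `init_embeddings` and `featurize`, and the keys `frac_a` and `frac_b`. The embedding values are random placeholders; in entity embedding they would be weights learned by a neural network trained on the target energy.

```python
import random

# Hypothetical vocabularies for each categorical descriptor (illustrative only).
VOCAB = {
    "surface": ["Pt3Ni", "Cu3Au", "PdAg"],
    "adsorbate": ["H", "O", "CO"],
    "site": ["top", "bridge", "hollow"],
}
EMBED_DIM = 4  # assumed embedding width per descriptor


def init_embeddings(vocab, dim, seed=0):
    """Create one lookup table per descriptor: category -> dense vector.

    Values are random placeholders standing in for learned weights.
    """
    rng = random.Random(seed)
    return {
        field: {cat: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for cat in cats}
        for field, cats in vocab.items()
    }


def featurize(sample, tables, numeric_keys):
    """Concatenate the embedding of each categorical field with numeric features."""
    features = []
    for field in tables:
        features.extend(tables[field][sample[field]])
    features.extend(float(sample[key]) for key in numeric_keys)
    return features


tables = init_embeddings(VOCAB, EMBED_DIM)
sample = {
    "surface": "Pt3Ni",
    "adsorbate": "CO",
    "site": "hollow",
    "frac_a": 0.75,  # hypothetical stoichiometric fractions of the two slab metals
    "frac_b": 0.25,
}
vector = featurize(sample, tables, numeric_keys=("frac_a", "frac_b"))
print(len(vector))  # 3 descriptors * 4 dimensions + 2 numeric features = 14
```

A vector of this form could then be passed to any tabular regressor, such as a decision tree model, without supplying explicit geometric information.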