The electron affinity (EA) and ionization energy (IE) of battery electrolyte molecules are pivotal in determining the electrochemical properties of electrode materials and significantly affect battery performance and safety. In this study, we used a suite of machine learning algorithms (linear regression, gradient boosting, CatBoost, XGBoost, and random forest regression) to predict these properties. Our analysis used a dataset of 24,432 electrolyte molecules extracted from the Joint Center for Energy Storage Research (JCESR) Molecules database, with molecular descriptors generated using RDKit and Mordred. Through feature engineering and selection with the maximum relevance minimum redundancy (mRMR) algorithm, we identified the most influential molecular features for predicting EA and IE. The CatBoost model was the most accurate, outperforming the other models through its ability to handle complex nonlinear relationships and giving robust predictions with lower errors. Its predictions were further validated through 10-fold cross-validation, demonstrating generalization capability and resistance to overfitting. The CatBoost model achieved a root mean squared error (RMSE) of 0.239 eV for IE and 0.427 eV for EA, with R-squared (R²) values of 0.944 and 0.879, respectively. SHapley Additive exPlanations (SHAP) value analysis elucidated the contribution of each feature to the model's predictions, highlighting the molecular characteristics that most strongly influence EA and IE. Our findings offer insights for the design of novel electrolyte materials with tailored electronic properties, advancing the development of high-performance battery technologies.
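The evaluation protocol summarized above (RMSE and R² under 10-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: the function names `rmse`, `r2_score`, `k_fold_indices`, and `cross_validate` are hypothetical, and the regressor is passed in as generic `fit`/`predict` callables rather than an actual CatBoost model.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target (eV for EA/IE)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined (division by zero) if all values in y_true are identical.
    """
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Striding assigns every sample to exactly one fold.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Return a list of per-fold (RMSE, R2) tuples.

    `fit(X_train, y_train)` returns a model object and
    `predict(model, X_test)` returns a list of predictions; both are
    placeholders for whatever regressor is being evaluated.
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(y), k, seed):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        y_hat = predict(model, [X[i] for i in test_idx])
        y_test = [y[i] for i in test_idx]
        scores.append((rmse(y_test, y_hat), r2_score(y_test, y_hat)))
    return scores
```

Averaging the per-fold scores gives a single RMSE and R² per target. Comparing those averages with the training-set values is one common way to check for overfitting.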