Rising fossil fuel consumption has driven a substantial increase in greenhouse gas concentrations and global temperatures, with severe climate-related consequences. The need to reduce CO2 emissions and limit global warming has spurred extensive research into clean energy alternatives, with hydrogen emerging as a compelling zero-emission energy source. For hydrogen to play a central role in clean energy strategies, compact, lightweight, and efficient storage systems are required. This study develops and evaluates machine learning models for predicting the hydrogen storage performance of Metal-Organic Frameworks (MOFs). MOFs, a class of nanoporous materials, show remarkable potential for hydrogen storage owing to their high surface area and porosity. However, selecting the most suitable MOF from the vast number of possible structures is a daunting task. Machine learning algorithms offer an efficient alternative by predicting MOF suitability from structural and chemical properties. We used two ensemble learning methods, Light Gradient Boosting Machine (LightGBM) and Random Forest (RF), to predict the hydrogen uptake of MOFs from a dataset of 219 experimentally tested samples. Two modeling scenarios were considered: one using the entire dataset, and the other applying targeted data pre-processing, including outlier removal and feature engineering. The results demonstrate that refining the dataset significantly enhances the predictive performance of the models, reducing prediction errors and improving overall goodness of fit.
Specifically, the Mean Absolute Error (MAE) fell from 0.48 (LightGBM) and 0.94 (RF) to 0.16 for both models, and the coefficient of determination (R²) rose from 0.84 and 0.72, respectively, to 0.95 for both models. Feature importance analysis showed that pressure-related features contribute most to the construction of the tree ensembles during model training. A parametric sensitivity analysis revealed that H2 uptake in MOFs is most sensitive to changes in adsorption enthalpy, followed by surface area and temperature, and less sensitive to variations in pressure, consistent with established literature. These results underscore the pivotal role of data enhancement methods in refining machine learning models and can help accelerate the development and optimization of MOF materials for clean energy applications.
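To make the two reported metrics and the effect of outlier removal concrete, the following is a minimal sketch in standard-library Python. It is not the study's pipeline: the abstract does not state which outlier criterion was used, so the interquartile-range (IQR) rule here is an assumption, and the uptake values are invented for illustration rather than taken from the 219-sample dataset. The function names are likewise hypothetical.

```python
from statistics import quantiles


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all measured values are equal")
    return 1.0 - ss_res / ss_tot


def iqr_keep_mask(values, k=1.5):
    """Return True for each value inside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: the IQR rule stands in for the unspecified
    outlier-removal step described in the text.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [low <= v <= high for v in values]


# Invented H2 uptake values (wt%); the last sample is a deliberate outlier.
measured = [1.0, 1.2, 1.5, 1.4, 1.1, 1.3, 9.0]
predicted = [1.1, 1.1, 1.4, 1.5, 1.2, 1.2, 2.0]

# Scenario 1: score the predictions on the entire dataset.
mae_all = mean_absolute_error(measured, predicted)
r2_all = r2_score(measured, predicted)

# Scenario 2: drop samples flagged as outliers, then score again.
mask = iqr_keep_mask(measured)
measured_clean = [m for m, keep in zip(measured, mask) if keep]
predicted_clean = [p for p, keep in zip(predicted, mask) if keep]
mae_clean = mean_absolute_error(measured_clean, predicted_clean)
r2_clean = r2_score(measured_clean, predicted_clean)

print(f"all data:     MAE={mae_all:.3f}  R2={r2_all:.3f}")
print(f"outliers cut: MAE={mae_clean:.3f}  R2={r2_clean:.3f}")
```

In this toy case a single extreme sample dominates both metrics, which is the general mechanism by which outlier removal can lower MAE and raise R². The sketch scores fixed predictions only; in the study the models were retrained on the refined dataset, which this example does not attempt to reproduce.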