Abstract
By combining molecular simulation, feature analysis and extraction, and the construction and optimization of machine learning models, we developed accurate regression models for the high-throughput screening of organic framework materials for CF4/N2, C2F6/N2, and SF6/N2 separation. Grand canonical Monte Carlo (GCMC) simulations were used to model the adsorption of these three gas mixtures in 603 organic framework materials at 298 K and a range of pressures, yielding a dataset suitable for machine learning. We analyzed how six common structural features of the adsorbents (PLD, LCD, Density, ASA, AVF, and AV) affect separation performance and determined the optimal range of each feature for each gas mixture. We also introduced two custom descriptors (AVG_SIG and AVG_SQRT_EPS) that characterize the force field parameters of the adsorbent, and found that both correlate significantly with adsorption separation performance. With pressure, the structural features, and the custom descriptors as inputs, and the TSQ value, which represents adsorption separation performance, as the target, we trained eight machine learning models on the three datasets: two based on linear regression (MLR, RR), four based on decision trees (DT, RF, GBDT, XGBoost), and two based on neural networks (MLP, GN). The results indicate that simple models cannot reliably predict adsorption separation performance, whereas structurally more complex machine learning models show significant potential. We used the Harris Hawks Optimization (HHO) algorithm to tune the hyperparameters of several of these models and introduced improvements to the GN network. The optimized models, especially XGBoost and GN, performed best and significantly improved the accuracy of the predicted adsorption separation performance.
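The abstract names the two custom descriptors, AVG_SIG and AVG_SQRT_EPS, but does not define them. A minimal sketch of one plausible reading follows, assuming they are atom-count-weighted averages of the Lennard-Jones size parameter σ and of the square root of the well depth ε over all framework atoms. That definition, the function name `force_field_descriptors`, and the parameter table are assumptions made for illustration; the numerical values are placeholders, not values from the paper or from any specific force field.

```python
import math

# Hypothetical Lennard-Jones parameters per element: (sigma in angstrom, epsilon/kB in K).
# Placeholder values for illustration only; not taken from the paper or a real force field.
LJ_PARAMS = {
    "C": (3.40, 50.0),
    "H": (2.60, 20.0),
    "O": (3.10, 30.0),
    "N": (3.30, 35.0),
}


def force_field_descriptors(atom_counts, lj_params=LJ_PARAMS):
    """Return (avg_sig, avg_sqrt_eps) for one framework.

    atom_counts maps element symbol -> number of atoms in the unit cell.
    Assumed definitions (not confirmed by the source):
      avg_sig      = mean of sigma over all framework atoms
      avg_sqrt_eps = mean of sqrt(epsilon) over all framework atoms
    """
    total = sum(atom_counts.values())
    if total <= 0:
        raise ValueError("atom_counts must contain at least one atom")

    sig_sum = 0.0
    sqrt_eps_sum = 0.0
    for element, count in atom_counts.items():
        sigma, epsilon = lj_params[element]  # KeyError if the element is not tabulated
        sig_sum += count * sigma
        sqrt_eps_sum += count * math.sqrt(epsilon)

    return sig_sum / total, sqrt_eps_sum / total


if __name__ == "__main__":
    # Hypothetical framework composition, used only to exercise the function.
    avg_sig, avg_sqrt_eps = force_field_descriptors({"C": 24, "H": 12, "N": 6})
    print(f"AVG_SIG = {avg_sig:.3f}, AVG_SQRT_EPS = {avg_sqrt_eps:.3f}")
```

Under this reading, each framework contributes two scalar columns that can be placed alongside pressure and the six structural features in the model input. Averaging √ε rather than ε would be consistent with geometric-mean mixing of well depths between framework and adsorbate atoms, although the abstract does not state that this is the motivation.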