Abstract

Predicting the solubility of hydrogen (H2) in aqueous solutions is crucial for studying hydrogen reactions in the formation, and it also affects the safety and optimal design of hydrogen storage. In this research, five machine learning (ML) algorithms, namely adaptive boosting decision tree (AdaBoost-DT), adaptive boosting support vector regression (AdaBoost-SVR), gradient boosting decision tree (GB-DT), gradient boosting support vector regression (GB-SVR), and k-nearest neighbors (KNN), and three white-box techniques, namely gene expression programming (GEP), genetic programming (GP), and group method of data handling (GMDH), were developed to predict H2 solubility in pure and saline water systems. To this end, a databank of 427 experimental data points was collected, with temperature, pressure, and salt concentration (mSalt) as input variables. The validity and precision of the developed models were assessed using several statistical and graphical tests. The results demonstrate that the AdaBoost-SVR model performed best, providing precise predictions with a root mean square error (RMSE) of 0.000115 and a determination coefficient (R2) of 0.9973. Among the white-box models, GEP provided the best results, with an RMSE of 0.000362 and an R2 of 0.9542. Although the accuracy of GEP is slightly lower than that of AdaBoost-SVR, it offers an explicit and simple mathematical formula for calculating H2 solubility, which is the main advantage of white-box models. The results also demonstrate that AdaBoost-SVR outperforms cubic equations of state (EOSs) such as Peng-Robinson (PR), Redlich-Kwong (RK), Soave-Redlich-Kwong (SRK), and Zudkevitch-Joffe (ZJ). In addition, trend analysis showed that the AdaBoost-SVR model reproduces the actual trends of H2 solubility with changing temperature and pressure. Finally, outlier detection using the Leverage technique indicated that the majority of the data points used for modeling (nearly 94 %) are reliable and fall within the valid zone.
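As a reading aid, the two headline metrics can be computed from measured and predicted solubilities roughly as sketched below. The abstract does not state the exact formulas used, so the common definitions are assumed here: RMSE as the square root of the mean squared error, and R2 as one minus the ratio of the residual sum of squares to the total sum of squares. The function names and sample values are hypothetical and are not taken from the study's databank.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Determination coefficient, assumed here to be 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical solubility values, for illustration only (not from the paper).
measured = [0.0010, 0.0020, 0.0030, 0.0040]
predicted = [0.0011, 0.0019, 0.0031, 0.0039]

print("RMSE:", rmse(measured, predicted))
print("R2:", r2(measured, predicted))
```

Because RMSE carries the units of the target, values such as 0.000115 are only meaningful relative to the magnitude of the solubility data, whereas R2 is dimensionless and easier to compare across models.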
