Optimization and comparison of machine learning methods in estimation of carbon dioxide loading in chemical solvents for environmental applications

Liang Chen, Huan Huang, Lakshmi Thangavelu, Walid Kamal Abdelbasset, Dmitry Olegovich Bokov, Mohammed Algarni, Sami Ghazali, May Alashwal

doi:10.1016/j.molliq.2022.118513

Abstract

In this study, we developed several machine learning ensemble models for predicting and correlating CO2 solubility in amino acid salt solutions at different concentrations. The models were used to establish a relationship between the process parameters and the CO2 loading in the solvent. The sole model output was the CO2 loading, i.e., the amount of CO2 dissolved in the chemical solvent. We tried three different estimators to correlate the CO2 loading, drawn from bagging and boosting, both of which are subclasses of ensemble techniques. In ensemble techniques, a number of weak models are combined to build a strong and robust model, here for the prediction of solubility values. The models used were random forests (RF), extremely randomized trees (ERT), and boosted K-NN (with AdaBoost). We repeated the fitting procedure multiple times in order to determine the best hyper-parameters for each of the models. Following optimization, the R2 scores of all three models were above 0.9, suggesting that the models had high predictive performance. ERT had the highest R2 score among all models, 0.999. Boosted K-NN achieved an R2 of 0.998, and Random Forest achieved an R2 of 0.992.
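To make the ensemble idea and the R2 metric concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It bags k-nearest-neighbour regressors (a swapped-in, simpler technique than the RF, ERT, and AdaBoost K-NN models named in the abstract) and scores them with R2 = 1 - SS_res / SS_tot. The three inputs (temperature, pressure, concentration, all scaled to [0, 1]), the target function, and every function name are hypothetical and chosen only for illustration; the data are synthetic and carry no physical meaning.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def knn_predict(train, x, k=3):
    """Weak learner: average the targets of the k training points nearest to x."""
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], x))
    nearest = sorted(train, key=sq_dist)[:k]
    return sum(y for _, y in nearest) / len(nearest)


def bagged_knn_predict(train, x, n_models=25, k=3, seed=0):
    """Bagging: fit each weak learner on a bootstrap resample, then average."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_models):
        sample = [rng.choice(train) for _ in train]  # draw with replacement
        preds.append(knn_predict(sample, x, k))
    return sum(preds) / len(preds)


def synthetic_loading(t, p, c):
    """Hypothetical smooth target standing in for CO2 loading (assumption)."""
    return 0.5 * p - 0.3 * t + 0.2 * c + 0.1 * p * c


# Synthetic data set: 200 points, split 150 / 50 into train and test.
rng = random.Random(42)
points = [(rng.random(), rng.random(), rng.random()) for _ in range(200)]
data = [(x, synthetic_loading(*x)) for x in points]
train, test = data[:150], data[150:]

y_true = [y for _, y in test]
y_pred = [bagged_knn_predict(train, x) for x, _ in test]
score = r2_score(y_true, y_pred)
print(f"R2 of bagged k-NN on synthetic data: {score:.3f}")
```

Hyper-parameter selection of the kind the abstract describes would amount to repeating this fit-and-score loop over candidate values (here, for example, `k` and `n_models`) on held-out data and keeping the best-scoring setting.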

Full Text