Modeling of nitrogen solubility in normal alkanes using machine learning methods compared with cubic and PC-SAFT equations of state

Seyed Ali Madani,Mohammad-Reza Mohammadi,Saeid Atashrouz,Ali Abedi,Ahmad Mohaddespour,Abdolhossein Hemmati-Sarapardeh

doi:10.1038/s41598-021-03643-8

Abstract

Accurate prediction of the solubility of gases in hydrocarbons is a crucial factor in designing enhanced oil recovery (EOR) operations by gas injection as well as separation, and chemical reaction processes in a petroleum refinery. In this work, nitrogen (N2) solubility in normal alkanes as the major constituents of crude oil was modeled using five representative machine learning (ML) models namely gradient boosting with categorical features support (CatBoost), random forest, light gradient boosting machine (LightGBM), k-nearest neighbors (k-NN), and extreme gradient boosting (XGBoost). A large solubility databank containing 1982 data points was utilized to establish the models for predicting N2 solubility in normal alkanes as a function of pressure, temperature, and molecular weight of normal alkanes over broad ranges of operating pressure (0.0212–69.12 MPa) and temperature (91–703 K). The molecular weight range of normal alkanes was from 16 to 507 g/mol. Also, five equations of state (EOSs) including Redlich–Kwong (RK), Soave–Redlich–Kwong (SRK), Zudkevitch–Joffe (ZJ), Peng–Robinson (PR), and perturbed-chain statistical associating fluid theory (PC-SAFT) were used comparatively with the ML models to estimate N2 solubility in normal alkanes. Results revealed that the CatBoost model is the most precise model in this work with a root mean square error of 0.0147 and coefficient of determination of 0.9943. ZJ EOS also provided the best estimates for the N2 solubility in normal alkanes among the EOSs. Lastly, the results of relevancy factor analysis indicated that pressure has the greatest influence on N2 solubility in normal alkanes and the N2 solubility increases with increasing the molecular weight of normal alkanes.

Highlights

Accurate prediction of the solubility of gases in hydrocarbons is a crucial factor in designing enhanced oil recovery (EOR) operations by gas injection as well as separation, and chemical reaction processes in a petroleum refinery
The findings demonstrate that both SRK and perturbed-chain statistical associating fluid theory (PC-SAFT) equations of state (EOSs) estimate the experimentally observed values with reasonable accuracy[41]
To find the best model in each aforementioned algorithm, a routine procedure has been done to find the hyperparameters and the other functional features of each model. Since these models have been implemented in python, different libraries including scikit-learn for k-nearest neighbors (k-NN) and Random forest[110], xgboost for XGBoost, lightgbm for L ightGBM98, and c atboost[99] for Catboost have been employed in this study