Interpretable reconstruction of naphtha components using property-based extreme gradient boosting and compositional-weighted Shapley additive explanation values

Yi Shi, Weimin Zhong, Xin Peng, Minglei Yang, Wei Du

doi:10.1016/j.ces.2023.119462

Abstract

Various methods exist for reconstructing the molecular composition of petroleum feedstocks from their bulk properties. While data-driven approaches are precise and efficient, they often lack mechanistic insight. This paper presents an interpretable, data-driven model for naphtha composition reconstruction. Utilizing a property-based Extreme Gradient Boosting (XGBoost) model, optimized with the Tree Parzen Estimator (TPE) and property mixing rules, we achieve notable accuracy. The model leverages Shapley Additive Explanations (SHAP) to elucidate the influence of each property on specific compositions. Moreover, we introduce a compositional-weighted SHAP metric, revealing overarching molecular distribution patterns. Our analyses show that PIONA values and boiling points have a more pronounced effect on molecular compositions than other examined properties. Finally, the SOL-CNN model is employed for accurate property prediction of predefined components.
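
The abstract does not give the formula for the compositional-weighted SHAP metric, so the following is only a minimal sketch of one plausible reading: compute Shapley values of each property for each component's predicted fraction, then average their absolute values across components, weighting by component fraction. The toy linear component models, the property ordering, the zero baseline, the fractions, and the weighting formula itself are all assumptions made for illustration, not the authors' definitions. Exact Shapley values are computed by enumerating coalitions, which is only practical for a handful of properties.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to baseline.

    Properties absent from a coalition are set to their baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def composition_weighted_importance(shap_by_component, fractions):
    """Mean |SHAP| per property, weighted by component fraction (assumed form)."""
    total = sum(fractions)
    n_props = len(shap_by_component[0])
    return [
        sum(w * abs(phi[p]) for phi, w in zip(shap_by_component, fractions)) / total
        for p in range(n_props)
    ]


# Hypothetical inputs: [paraffin content, normalized boiling point, normalized density].
x = [1.0, 0.5, 0.2]
baseline = [0.0, 0.0, 0.0]

# Hypothetical stand-ins for two per-component regressors (e.g. trained boosted trees).
component_models = [
    lambda p: 0.6 * p[0] + 0.1 * p[1],
    lambda p: 0.2 * p[0] - 0.3 * p[1] + 0.05 * p[2],
]
fractions = [0.7, 0.3]  # assumed component fractions used as weights

phi_by_component = [shapley_values(m, x, baseline) for m in component_models]
importance = composition_weighted_importance(phi_by_component, fractions)
print(importance)
```

In this toy setup the first property receives the largest weighted importance, which mirrors the kind of ranking the abstract reports for PIONA values and boiling points, but the numbers here carry no physical meaning.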