Abstract
Several methods exist for reconstructing the molecular composition of petroleum feedstocks from their bulk properties. Data-driven approaches are precise and efficient, but they often lack mechanistic insight. This paper presents an interpretable, data-driven model for naphtha composition reconstruction. We use a property-based Extreme Gradient Boosting (XGBoost) model, optimized with the Tree-structured Parzen Estimator (TPE) and property mixing rules, and achieve notable accuracy. We apply Shapley Additive Explanations (SHAP) to quantify the influence of each property on specific compositions. We also introduce a composition-weighted SHAP metric that reveals overall molecular distribution patterns. Our analyses show that PIONA (paraffin, isoparaffin, olefin, naphthene, and aromatic) values and boiling points have a more pronounced effect on molecular compositions than the other properties examined. Finally, the SOL-CNN model is employed for accurate property prediction of predefined components.
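To make the attribution idea concrete, the following minimal sketch computes exact Shapley values by brute-force enumeration for a toy model and then aggregates them with composition weights. It is an illustration under stated assumptions, not the method of this paper: the SHAP library relies on faster tree-specific algorithms rather than enumeration, the aggregation rule (mean absolute attribution per component, weighted by normalized component fractions) is only one plausible reading of a composition-weighted metric, and all function names, property names, and numbers are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input.

    Features outside a coalition are replaced by their baseline value.
    Cost grows exponentially with the number of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def composition_weighted_importance(abs_shap, fractions):
    """Aggregate per-component attributions into one score per property.

    abs_shap[k][p] is the absolute attribution of property p for component k,
    and fractions[k] is the fraction of component k in the mixture.
    ASSUMPTION: weights are the fractions normalized to sum to one.
    """
    total = sum(fractions)
    n_props = len(abs_shap[0])
    return [
        sum(fractions[k] / total * abs_shap[k][p] for k in range(len(fractions)))
        for p in range(n_props)
    ]


# Hypothetical toy setup: two components, each predicted from three
# scaled bulk properties (names are illustrative only).
properties = ["paraffin_fraction", "T50", "density"]
component_models = [
    lambda v: 2.0 * v[0] - 1.0 * v[1] + 0.5 * v[2],
    lambda v: v[0] * v[1] + 0.25 * v[2],
]
x = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
fractions = [0.75, 0.25]

abs_shap = [[abs(p) for p in shapley_values(f, x, baseline)] for f in component_models]
scores = composition_weighted_importance(abs_shap, fractions)
for name, score in zip(properties, scores):
    print(name, round(score, 4))
```

For a linear model the attribution of each property reduces to its coefficient times the deviation from the baseline, which gives a quick sanity check on the enumeration. The attributions also sum to the difference between the prediction at `x` and at the baseline.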