Determination of gross calorific value in crude oil by variable selection methods applied to 13C NMR spectroscopy

Ellisson H De Paulo, Francine D Dos Santos, Gabriely S Folli, Layla P Santos, Márcia H.C Nascimento, Mariana K Moro, Pedro H.P Da Cunha, Eustáquio V.R Castro, Alvaro Cunha Neto, Paulo R Filgueiras

doi:10.1016/j.fuel.2021.122527

Journal: Fuel
Publication Date: Nov 12, 2021
Citations: 11

Affiliation: Universidade Federal do Espírito Santo

Abstract

Gross calorific value (GCV) is one of the properties used to assess the quality and value of fuel in the oil industry, but the standard method for determining it is laborious. Regression models built with 13C nuclear magnetic resonance (13C NMR) data make it possible to estimate different physicochemical properties of oils. A drawback, however, is the enormous amount of chemical information produced in a single spectrum, and not all variables contribute to the model. Variable selection (VS) methods can identify the information that has the highest correlation with the property of interest. In this study, we applied different methods to 13C NMR data of 145 Brazilian crude oil samples with GCV ranging from 41.5 to 47 MJ∙kg−1. For variable selection, we used genetic algorithm (GA), variable importance in projection (VIP), uninformative variable elimination (UVE), angular search algorithm with variance inflation factor (ASA-VIF), competitive adaptive reweighted sampling (CARS), synergy interval partial least squares (siPLS), interval partial least squares (iPLS), subwindow permutation analysis (SPA), ordered predictors selection (OPS) and particle swarm optimization (PSO). We also used orthogonal projection to latent structures (OPLS) and partial least squares (PLS) with full spectra (PLS-full). All models using variable selection obtained a lower root mean squared error of prediction (RMSEP) than the model based on the full spectral range (PLS-full). PSO gave the most accurate model, with an RMSEP of 0.152 MJ∙kg−1. PSO selected the paraffinic region (0 to 4.43 ppm), totaling 701 variables. Statistical analysis showed no trends in the residuals at the 5% significance level for the best models.
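To make the idea of variable selection and the RMSEP comparison concrete, the following is a minimal sketch in plain Python. It is not any of the ten methods used in the study: it uses a simple correlation filter (keep the channels most correlated with the property), and a one-predictor least-squares fit on the mean of the chosen channels stands in for PLS. The data are synthetic, and every name in it (`select_by_correlation`, `channel_mean`, `evaluate`, the 60 x 40 matrix, the informative channels 5 to 9) is a hypothetical choice made for illustration only.

```python
import math
import random


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction over a test set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def select_by_correlation(X, y, k):
    """Sorted indices of the k columns with the largest |correlation| to y."""
    n_vars = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_vars)]
    ranked = sorted(range(n_vars), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def fit_line(x, y):
    """Least-squares slope and intercept for a single predictor."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def channel_mean(X, cols):
    """Per-sample mean intensity over the given channels."""
    return [sum(row[j] for j in cols) / len(cols) for row in X]


# Synthetic "spectra" (assumption): 60 samples x 40 channels,
# where only channels 5..9 carry information about the property.
rng = random.Random(0)
informative = list(range(5, 10))
X, y = [], []
for _ in range(60):
    level = rng.uniform(0.0, 1.0)
    row = [rng.gauss(0.0, 1.0) for _ in range(40)]
    for j in informative:
        row[j] = level + rng.gauss(0.0, 0.05)
    X.append(row)
    # Property values spanning roughly the 41.5 to 47 range quoted in the abstract.
    y.append(41.5 + 5.5 * level + rng.gauss(0.0, 0.05))

# Calibration / prediction split; selection uses calibration data only.
X_train, y_train, X_test, y_test = X[:40], y[:40], X[40:], y[40:]
selected = select_by_correlation(X_train, y_train, k=5)


def evaluate(cols):
    """Fit on the calibration set, report RMSEP on the prediction set."""
    slope, intercept = fit_line(channel_mean(X_train, cols), y_train)
    preds = [slope * v + intercept for v in channel_mean(X_test, cols)]
    return rmsep(y_test, preds)


rmsep_full = evaluate(list(range(40)))
rmsep_selected = evaluate(selected)
print("selected channels:", selected)
print("RMSEP full:", round(rmsep_full, 3), "selected:", round(rmsep_selected, 3))
```

On this toy data the model restricted to the selected channels should give a lower RMSEP than the one using all channels, which mirrors, in a much simplified setting, the comparison between the variable-selection models and PLS-full described above. A real analysis would replace the one-predictor fit with PLS and the filter with one of the selection methods listed in the abstract.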

Full Text