Model stacking to improve prediction and variable importance robustness for soft sensor development

Maxwell Barton, Barry Lennox

doi:10.1016/j.dche.2022.100034

Abstract

This paper presents an interpretable ensemble modelling method in which the predictions of several individual base learners are combined through stacked generalisation. Stacking uses a second-layer model, the so-called meta-learner, which is trained on the cross-validation predictions of each base learner. To provide interpretability, the permutation variable importance (PVI) is computed on the ensemble: each variable is randomly shuffled in turn and the resulting reduction in the ensemble's predictive performance is calculated. This is a novel contribution, as no previous attempts have been made in the soft sensor literature to investigate the interpretability of ensemble models that use heterogeneous base learners. The stacked ensemble also avoids model selection, the process of choosing among many candidate models. Model selection is often based on cross-validation, which is not guaranteed to select the best model in terms of true generalisation performance on the test set. The proposed method instead combines multiple models rather than choosing a single one. The efficacy of the proposed methodology, in terms of both variable importance and predictive performance, is shown on a synthetic dataset in which the variable importance is already known and on an industrial dataset of a refinery process provided by Dow. For the synthetic dataset, the proposed method identifies the correct causal variables, whereas the built-in variable importance provided by the individual models, namely partial least squares, Lasso, random forests and XGBoost, can give increased importance to non-causal, randomly generated variables. For the industrial study, the combined ensemble outperforms all individual base models in terms of predictive performance, whilst also providing a new perspective on variable importance compared with previous studies.
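The two ideas in the abstract, stacking on cross-validation predictions and permutation variable importance computed on the whole ensemble, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: ordinary least squares and k-nearest neighbours stand in for the paper's base learners (PLS, Lasso, random forests, XGBoost), the meta-learner is assumed to be a linear regression, and the synthetic data and all function names (`fit_ols`, `fit_knn`, `fit_stack`, `permutation_importance`) are hypothetical.

```python
import random

def fit_ols(X, y):
    """Ordinary least squares with intercept, solved via the normal equations."""
    A = [[1.0] + list(r) for r in X]
    p = len(A[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(a[i] * a[j] for a in A) for j in range(p)]
         + [sum(a[i] * t for a, t in zip(A, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    w = [M[i][p] / M[i][i] for i in range(p)]
    return lambda row: w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))

def fit_knn(X, y, k=5):
    """k-nearest-neighbour regressor using squared Euclidean distance."""
    def predict(row):
        idx = sorted(range(len(X)),
                     key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], row)))[:k]
        return sum(y[i] for i in idx) / len(idx)
    return predict

def fit_stack(X, y, learners, n_folds=5):
    """Train a meta-learner on out-of-fold predictions of each base learner."""
    n = len(X)
    oof = [[0.0] * len(learners) for _ in range(n)]
    for f in range(n_folds):
        train = [i for i in range(n) if i % n_folds != f]
        test = [i for i in range(n) if i % n_folds == f]
        X_fold, y_fold = [X[i] for i in train], [y[i] for i in train]
        for m, fit in enumerate(learners):
            model = fit(X_fold, y_fold)
            for i in test:
                oof[i][m] = model(X[i])
    meta = fit_ols(oof, y)                      # assumed linear meta-learner
    bases = [fit(X, y) for fit in learners]     # refit base learners on all data
    return lambda row: meta([b(row) for b in bases])

def mse(model, X, y):
    return sum((model(r) - t) ** 2 for r, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, rng, n_repeats=5):
    """Mean increase in MSE when each column is shuffled in turn."""
    baseline = mse(model, X, y)
    scores = []
    for j in range(len(X[0])):
        increase = 0.0
        for _ in range(n_repeats):
            col = [r[j] for r in X]
            rng.shuffle(col)
            X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, col)]
            increase += mse(model, X_perm, y) - baseline
        scores.append(increase / n_repeats)
    return scores

# Hypothetical synthetic data: x0 and x1 are causal, x2 is pure noise.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(150)]
y = [3 * r[0] + 2 * r[1] + r[1] ** 2 + rng.gauss(0, 0.1) for r in X]
X_train, y_train, X_test, y_test = X[:100], y[:100], X[100:], y[100:]

stack = fit_stack(X_train, y_train, [fit_ols, fit_knn])
pvi = permutation_importance(stack, X_test, y_test, rng)
print([round(v, 2) for v in pvi])
```

Because the importance is measured on the ensemble's final prediction rather than on any single base learner, it does not depend on each model's built-in importance measure. Computing it on held-out rows is a choice made for this sketch.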

Journal: Digital Chemical Engineering
Publication Date: May 26, 2022
Citations: 14
License type: CC BY-NC-ND
