Abstract

The calibration performance of partial least squares for one response variable (PLS1) can be improved by elimination of uninformative variables. Many methods are based on so-called predictive variable properties, which are functions of various PLS-model parameters, and which may change during the variable reduction process. In these methods variable reduction is made on the variables ranked in descending order for a given variable property. The methods start with full spectrum modelling. Iteratively, until a specified number of remaining variables is reached, the variable with the smallest property value is eliminated; a new PLS model is calculated, followed by a renewed ranking of the variables. The Stepwise Variable Reduction methods using Predictive-Property-Ranked Variables are denoted as SVR-PPRV. In the existing SVR-PPRV methods the PLS model complexity is kept constant during the variable reduction process. In this study, three new SVR-PPRV methods are proposed, in which a possibility for decreasing the PLS model complexity during the variable reduction process is build in. Therefore we denote our methods as PPRVR-CAM methods (Predictive-Property-Ranked Variable Reduction with Complexity Adapted Models). The selective and predictive abilities of the new methods are investigated and tested, using the absolute PLS regression coefficients as predictive property. They were compared with two modifications of existing SVR-PPRV methods (with constant PLS model complexity) and with two reference methods: uninformative variable elimination followed by either a genetic algorithm for PLS (UVE-GA-PLS) or an interval PLS (UVE-iPLS). The performance of the methods is investigated in conjunction with two data sets from near-infrared sources (NIR) and one simulated set. The selective and predictive performances of the variable reduction methods are compared statistically using the Wilcoxon signed rank test. The three newly developed PPRVR-CAM methods were able to retain significantly smaller numbers of informative variables than the existing SVR-PPRV, UVE-GA-PLS and UVE-iPLS methods without loss of prediction ability. Contrary to UVE-GA-PLS and UVE-iPLS, there is no variability in the number of retained variables in each PPRV(R) method. Renewed variable ranking, after deletion of a variable, followed by remodelling, combined with the possibility to decrease the PLS model complexity, is beneficial. A preferred PPRVR-CAM method is proposed.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.