Abstract

Variable selection plays a pivotal role in the quantitative analysis of near-infrared (NIR) spectra, which typically have a large number of variables and relatively few samples. In this study, a novel algorithm, variable permutation population analysis (VPPA), which combines variable permutation, model population analysis (MPA) and an exponentially decreasing function (EDF), is proposed for variable selection to improve prediction performance in multivariate spectral calibration. The method first builds a large number of sub-datasets by Monte Carlo sampling (MCS) in both the sample space and the variable space. The importance of each variable is then evaluated from the rank order of the difference in partial least squares (PLS) model prediction error before and after that variable is permuted. Next, the EDF is applied to forcibly eliminate the relatively uninformative variables. Finally, cross validation is used to choose the optimal variable subset. Together, these four procedures form a complete methodology for variable selection. Three NIR datasets are used to illustrate the proposed method and evaluate its performance. With PLS as the modeling method, the results show that VPPA is a promising variable selection method that gives better prediction performance than conventional PLS, subwindow permutation analysis PLS (SPA-PLS), Monte Carlo uninformative variable elimination by PLS (MC-UVE-PLS), competitive adaptive reweighted sampling PLS (CARS-PLS) and genetic algorithm PLS (GA-PLS). Moreover, the proposed approach employs fewer variables than the variable selection methods listed above. The VPPA technique can therefore be recommended for practical implementation in multivariate calibration of NIR spectra.
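
The permutation and elimination steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, any fitted model passed in as a `predict` callable stands in for the PLS model, and the specific EDF form (a retained count decaying from all variables to `n_final` over a fixed number of rounds) is an assumption, since the abstract does not give the formula. Monte Carlo sampling and cross validation are omitted.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def permutation_score(predict, X, y, j, rng):
    """Increase in prediction error after shuffling variable j.

    `predict` is any fitted model (a stand-in for PLS here) that maps one
    row of X to a predicted value. X is a list of rows (lists of floats).
    """
    base = rmse(y, [predict(row) for row in X])
    col = [row[j] for row in X]
    rng.shuffle(col)  # break the link between variable j and y
    X_perm = []
    for row, v in zip(X, col):
        new_row = list(row)
        new_row[j] = v
        X_perm.append(new_row)
    return rmse(y, [predict(row) for row in X_perm]) - base


def edf_keep_counts(n_vars, n_rounds, n_final=2):
    """Assumed EDF: number of variables kept after round 0..n_rounds.

    The count decays exponentially from n_vars to n_final.
    """
    k = math.log(n_vars / n_final) / n_rounds
    return [round(n_vars * math.exp(-k * i)) for i in range(n_rounds + 1)]


def keep_top(scores, keep):
    """Indices of the `keep` highest-scoring variables, in ascending order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])
```

In this sketch a variable whose permutation barely changes the error receives a score near zero and is among the first removed as the retained count shrinks. In the method as described, scores would be aggregated over many Monte Carlo sub-models before ranking.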
