Abstract

Variable selection can improve the robustness and prediction accuracy of partial least squares (PLS) regression models and reduce computation time by selecting an optimal subset of variables in multivariate calibration. In this study, the performance of two types of variable selection methods, wavelength interval selection and individual wavelength selection, coupled with PLS regression is investigated using experimental data on asiaticoside (AS) and madecassoside (MS) contents in centella total glucosides (CTG) and a public corn dataset. The methods studied are interval partial least squares (iPLS), backward interval partial least squares (biPLS), synergy interval partial least squares (siPLS), competitive adaptive reweighted sampling (CARS), uninformative variable elimination (UVE), and variable importance in projection (VIP). The results show that variable selection improved model performance compared with full-spectrum modeling: all variable selection methods improved the prediction of AS or MS contents in CTG. When the number of latent variables in the PLS models is kept below 10, as in practical applications, the RPD value of the AS model built with iPLS is 7.5, and the RPD value of the MS model built with biPLS is 2.9. Wavelength interval selection gives better results than individual wavelength selection, especially with iPLS and biPLS. The same pattern was observed for moisture in the public corn data, where the biPLS model has an RPD value of 1.6. Therefore, wavelength interval selection methods such as iPLS or biPLS are appropriate for improving the accuracy and robustness of PLS models used to determine the contents of target components in solid samples.
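
The RPD values quoted above and the interval-based selection idea can be illustrated with a minimal sketch. It assumes the common definition of RPD as the standard deviation of the reference values divided by the root mean square error of prediction (the abstract does not state which variant was used), and it assumes equal-width contiguous intervals as the starting point for iPLS-type methods. The function names `rmsep`, `rpd`, and `split_intervals` are hypothetical and are not taken from the paper.

```python
import math
import statistics


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a test set."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def rpd(y_true, y_pred):
    """RPD under the assumed definition: sample SD of references / RMSEP."""
    return statistics.stdev(y_true) / rmsep(y_true, y_pred)


def split_intervals(n_wavelengths, n_intervals):
    """Divide wavelength indices into contiguous, near-equal intervals.

    Returns a list of (start, stop) pairs with stop exclusive. An
    interval-based method would fit one PLS model per interval (or per
    combination of intervals) and compare their prediction errors.
    """
    if n_intervals < 1 or n_intervals > n_wavelengths:
        raise ValueError("n_intervals must be between 1 and n_wavelengths")
    base, extra = divmod(n_wavelengths, n_intervals)
    bounds = []
    start = 0
    for i in range(n_intervals):
        # The first `extra` intervals take one additional wavelength.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds
```

Under this definition, a larger RPD means the prediction error is small relative to the spread of the reference values, which is how the values 7.5, 2.9, and 1.6 above can be compared across models.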

