Abstract

High-throughput spectral data with a large number of variables (wavelengths) can make the predictions of multivariate calibration models unreliable. In such cases, sparse statistical methods such as the least absolute shrinkage and selection operator (LASSO) are increasingly valued by researchers. In this study, a novel variable informative criterion based on a weighted voting strategy combined with LASSO (WV-LASSO) is proposed. Monte Carlo sampling (MCS) is used to generate a large number of sub-models. In each Monte Carlo run, the regression coefficients and the variable selection information of the LASSO model are recorded. A weighted voting strategy that combines the regression coefficient information with the selection frequency of each variable across all sub-models is then used to evaluate variable importance. Unlike a specific selection method, a variable informative (importance) criterion is more general and more flexible for algorithm design. An exponentially decreasing function (EDF) is then applied to build a variable selection method on top of WV-LASSO. The performance of this method was evaluated on three near-infrared (NIR) datasets.
The method was compared with several efficient variable selection methods based on different informative criteria: variable importance in projection (VIP), Monte Carlo uninformative variable elimination (MC-UVE), randomization test (RT), competitive adaptive reweighted sampling (CARS), stability competitive adaptive reweighted sampling (SCARS), variable iterative space shrinkage approach (VISSA), interval variable iterative space shrinkage approach (iVISSA), LASSO coupled with sampling error profile analysis (SEPA-LASSO), and variable combination population analysis (VCPA). Relative to these methods, the proposed variable selection method shows better prediction and interpretation ability, and it has potential for constructing further variable selection methods by combining the criterion with other selection strategies.
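
The two ingredients named above, weighted voting over Monte Carlo LASSO sub-models and an exponentially decreasing function, can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the importance score is taken to be the selection frequency multiplied by the mean normalized absolute coefficient, and the EDF is taken in the form used by CARS, r_i = a * exp(-k * i), with a and k chosen so that all variables are kept in the first iteration and two in the last. The function names `weighted_vote_importance` and `edf_keep_counts` are hypothetical, and the coefficient matrix is assumed to have been recorded beforehand from LASSO fits on Monte Carlo subsamples.

```python
import math


def weighted_vote_importance(coefs):
    """Score variables from recorded LASSO sub-model coefficients.

    coefs[m][j] is the coefficient of variable j in Monte Carlo sub-model m
    (0.0 when LASSO dropped the variable). The weighting below is an
    illustrative assumption, not the formula from the paper.
    """
    n_models = len(coefs)
    n_vars = len(coefs[0])
    freq = [0.0] * n_vars    # fraction of sub-models that selected variable j
    weight = [0.0] * n_vars  # mean share of the sub-model's total |coefficient|
    for row in coefs:
        total = sum(abs(b) for b in row)
        if total == 0.0:
            # An empty sub-model votes for no variable, but it still counts
            # in the denominator n_models.
            continue
        for j, b in enumerate(row):
            if b != 0.0:
                freq[j] += 1.0 / n_models
                weight[j] += abs(b) / total / n_models
    # Assumed combination rule: frequency times coefficient weight.
    return [f * w for f, w in zip(freq, weight)]


def edf_keep_counts(n_vars, n_iter):
    """Number of variables to keep at iterations 1..n_iter.

    Uses the CARS-style EDF r_i = a * exp(-k * i), which gives r_1 = 1
    and r_N = 2 / n_vars. Assumed here; the paper may parameterize it
    differently.
    """
    if n_vars < 2 or n_iter < 2:
        raise ValueError("need at least 2 variables and 2 iterations")
    k = math.log(n_vars / 2) / (n_iter - 1)
    a = (n_vars / 2) ** (1 / (n_iter - 1))
    return [
        max(2, round(n_vars * a * math.exp(-k * i)))
        for i in range(1, n_iter + 1)
    ]
```

In a selection loop built from these pieces, each iteration would refit the sub-models on the surviving variables, rank them with `weighted_vote_importance`, and keep the top entries according to the schedule from `edf_keep_counts`. For example, `edf_keep_counts(100, 10)` starts at 100 variables and ends at 2.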

