Abstract

Near-infrared (NIR) spectra contain information about the analyte alongside uninformative wavelengths. To build high-performance data-driven models, key wavelengths that correlate strongly with the analyte must be selected. This study proposes a feature selection method, stepwise Bayesian linear regression (SBLR), that eliminates unrelated wavelengths and thereby enhances the robustness of the constructed model. First, a randomly chosen wavelength is placed in an optimal variable set, and the remaining wavelengths are placed in a candidate variable set. In each step, a Bayesian linear regression (BLR) model is fitted after either adding a new variable from the candidate set or removing a variable from the optimal set. The BLR model is then used to perform an F-test: the test statistic is compared with the critical value at significance level α to determine whether the variable is retained in the optimal set. Finally, the extracted variables are used to construct a BLR model. The performance and generalization ability of the proposed method were validated. The physical interpretation of the extracted wavelengths is consistent with the chemical analysis of the experiment, which supports a good understanding of the collected NIR spectral data. In addition, compared with traditional algorithms such as partial least squares regression, the least absolute shrinkage and selection operator, and stepwise regression, the proposed method retains only a few effective wavelengths from the full NIR spectra. The proposed method demonstrates potential for key wavelength selection in NIR spectroscopy.
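The stepwise procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the BLR prior or the exact form of the F-test, so the sketch assumes a zero-mean Gaussian prior (whose posterior mean coincides with a ridge solution) and a standard partial F-test on the residual sum of squares. The names `blr_rss`, `stepwise_select`, `lam`, and `f_crit` are hypothetical, the starting variable is fixed rather than random for reproducibility, and the critical value is passed in directly instead of being derived from α.

```python
import numpy as np

def blr_rss(X, y, lam=1e-6):
    """Residual sum of squares of a BLR posterior-mean fit.

    Assumption: zero-mean Gaussian prior on the weights; lam is the ratio of
    prior precision to noise precision, so the posterior mean is a ridge fit.
    """
    Xb = np.column_stack([np.ones(len(y)), X])      # add an intercept column
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    w = np.linalg.solve(A, Xb.T @ y)                # posterior mean weights
    r = y - Xb @ w
    return float(r @ r)

def stepwise_select(X, y, f_crit=4.0, start=0, max_iter=100):
    """Add/remove variables using partial F-tests against f_crit.

    f_crit stands in for the critical value at significance level alpha;
    it could be looked up from the F distribution, e.g. with scipy.stats.f.ppf.
    """
    n, p = X.shape
    selected = [start]                              # initial optimal set
    for _ in range(max_iter):
        changed = False
        # Forward step: try each candidate, keep the one with the largest F.
        rss_cur = blr_rss(X[:, selected], y)
        dof = n - len(selected) - 2                 # residual dof after adding one
        best_f, best_j = 0.0, None
        for j in range(p):
            if j in selected:
                continue
            rss_new = blr_rss(X[:, selected + [j]], y)
            f = (rss_cur - rss_new) / (rss_new / dof)
            if f > best_f:
                best_f, best_j = f, j
        if best_j is not None and best_f > f_crit:
            selected.append(best_j)
            changed = True
        # Backward step: drop the weakest variable if its F is below f_crit.
        if len(selected) > 1:
            rss_full = blr_rss(X[:, selected], y)
            dof = n - len(selected) - 1
            worst_f, worst_j = np.inf, None
            for j in selected:
                rest = [k for k in selected if k != j]
                f = (blr_rss(X[:, rest], y) - rss_full) / (rss_full / dof)
                if f < worst_f:
                    worst_f, worst_j = f, j
            if worst_f < f_crit:
                selected.remove(worst_j)
                changed = True
        if not changed:
            break
    return sorted(selected)

# Synthetic stand-in for spectra: only columns 2 and 7 carry signal.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 10))
y_demo = 3.0 * X_demo[:, 2] - 2.0 * X_demo[:, 7] + 0.1 * rng.normal(size=200)
selected_demo = stepwise_select(X_demo, y_demo)
print(selected_demo)
```

On the synthetic data, the two informative columns should be among the selected indices, while the uninformative starting column is likely to be dropped in a backward step. Using the same threshold for entry and removal keeps the sketch short; a freshly added variable has the same F statistic in the following backward step, so it is not removed immediately.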
