Abstract

Wavelength selection is an important step in near-infrared (NIR) data analysis, as it yields more robust and accurate models for predicting products’ properties. This paper proposes a novel framework for wavelength selection in Partial Least Squares (PLS) models based on the absorbance interquartile range. The framework first divides samples into quartiles according to the response variable. The average absorbance of each quartile is then calculated for each wavelength, and the difference between the averages of the samples in the first and fourth quartiles is used to generate a wavelength importance ranking: wavelengths with larger differences are ranked higher and are expected to better explain variations in the response variable. The ranking guides the iterative inclusion of wavelength intervals into PLS models, starting with the interval presenting the largest average absorbance differences. The proposed framework was tested on three public datasets comprising 10 response variables that describe diesel, corn and soil properties, where it substantially reduced both the prediction error and the percentage of wavelengths included in the PLS models. When compared to four competing wavelength selection methods from the literature, the proposed framework yielded the best results for 7 of the 10 response variables in the training set and for 6 of the 10 in the testing set.
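
The ranking step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `rank_wavelengths`, the toy data, the use of equal-count quartile groups, and the use of the absolute difference of means as the "distance" are all assumptions made for illustration, since the abstract does not specify these details. The subsequent interval construction and PLS fitting are not covered.

```python
from statistics import mean


def rank_wavelengths(spectra, response):
    """Rank wavelength indices by the gap between Q4 and Q1 mean absorbance.

    spectra: list of samples, each a list of absorbances (one per wavelength).
    response: list of response values, one per sample.
    """
    if len(spectra) != len(response):
        raise ValueError("spectra and response must have the same length")
    if len(spectra) < 4:
        raise ValueError("need at least four samples to form quartiles")

    # Order sample indices by response value.
    order = sorted(range(len(response)), key=lambda i: response[i])

    # Assumption: quartiles are equal-count groups of the sorted samples.
    size = len(order) // 4
    first_q = order[:size]
    fourth_q = order[-size:]

    n_wavelengths = len(spectra[0])
    gaps = []
    for w in range(n_wavelengths):
        mean_q1 = mean(spectra[i][w] for i in first_q)
        mean_q4 = mean(spectra[i][w] for i in fourth_q)
        # Assumption: the "distance" is the absolute difference of the means.
        gaps.append(abs(mean_q4 - mean_q1))

    # Largest gap first; ties keep their original wavelength order.
    ranking = sorted(range(n_wavelengths), key=lambda w: gaps[w], reverse=True)
    return ranking, gaps


# Hypothetical toy data: 8 samples, 3 wavelengths.
# Only wavelength index 1 varies systematically with the response.
spectra = [
    [0.50, 0.10, 0.30],
    [0.52, 0.20, 0.31],
    [0.49, 0.30, 0.29],
    [0.51, 0.40, 0.30],
    [0.50, 0.50, 0.32],
    [0.48, 0.60, 0.30],
    [0.51, 0.70, 0.31],
    [0.50, 0.80, 0.33],
]
response = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

ranking, gaps = rank_wavelengths(spectra, response)
print(ranking)  # wavelength indices, most important first
```

On this toy data, the wavelength whose absorbance tracks the response is ranked first, which matches the intuition that a large first-to-fourth-quartile gap signals a wavelength that helps explain the response.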
