Abstract

Spectroscopic data generated by several process analytical technologies (PAT) are routinely used for the rapid assessment of quality properties in many industrial sectors, such as agrofood, beverages, pharmaceuticals, chemicals, and pulp & paper. While a spectrum can easily provide hundreds of measurements across many wavelengths, only a fraction of it conveys information relevant to predicting the property of interest. The performance of current models therefore depends strongly on the ability to select key wavebands, and prior knowledge to guide that selection is not always available. To address this, several feature selection procedures based on variants of interval partial least squares (iPLS) have been proposed. These methodologies, however, focus solely on determining the most relevant wavebands and do not attempt to further enhance prediction capability within each interval. Standard full-spectrum models, on the other hand, are often improved by reducing the spectral resolution, but this operation has not yet been integrated with waveband selection. As spectral aggregation can effectively improve modelling performance, a multiresolution selection algorithm that simultaneously selects the most relevant wavebands and their optimal resolution is proposed here. By design, this methodology leads to prediction models that are at least as good as the full-spectrum models. A performance comparison on simulated data and on real NIR spectra of gasoline samples also shows that the proposed methodology outperforms, in a statistically significant way, iPLS and its variants based on forward and backward selection of intervals.
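
As a rough illustration of the two ingredients the abstract combines, the sketch below shows spectral aggregation (averaging adjacent channels to lower the resolution) and the enumeration of candidate wavebands at several resolutions. It is not the authors' algorithm, whose details are not given here: the function names `aggregate_spectrum` and `candidate_blocks`, the equal-width interval split, and the example values are all assumptions made for illustration. In a complete procedure, a regression model such as PLS would presumably be fitted to each candidate and the best combination kept by cross-validation.

```python
from statistics import fmean


def aggregate_spectrum(spectrum, factor):
    """Lower the resolution by averaging each run of `factor` adjacent channels.

    Hypothetical helper: the last block may be shorter than `factor`.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return [fmean(spectrum[i:i + factor]) for i in range(0, len(spectrum), factor)]


def candidate_blocks(n_channels, n_intervals, factors):
    """Enumerate (start, stop, factor) candidates.

    Assumption: the spectrum is split into equal-width intervals, and every
    interval is considered at every aggregation factor.
    """
    width = -(-n_channels // n_intervals)  # ceiling division
    return [
        (start, min(start + width, n_channels), factor)
        for start in range(0, n_channels, width)
        for factor in factors
    ]


# Toy spectrum with 8 channels (made-up values).
spectrum = [1.0, 3.0, 2.0, 4.0, 10.0, 12.0, 7.0, 9.0]

# Halve the resolution: each pair of neighbouring channels becomes one value.
coarse = aggregate_spectrum(spectrum, 2)

# Two wavebands, each considered at three resolutions.
blocks = candidate_blocks(len(spectrum), 2, [1, 2, 4])
```

An aggregation factor of 1 leaves an interval at its native resolution, so a search over such candidates always includes the unaggregated wavebands as a special case.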
