Abstract

In this paper, we extend the concept of cross model validation (CMV) to multiple X and Y variables, where different spectroscopic techniques serve as the X and Y data in a regression context. For the first dataset, on marzipan samples, the main objective was to find significant regions in the spectral data and to discuss the issue of false discovery, i.e. combinations of variables that are erroneously found to be significant. A permutation test within the framework of CMV showed that no regression coefficients in the partial least squares regression (PLSR) model between FT-IR and VIS/NIR spectra show significance at the 5% level. We believe the reason is that CMV acts as a strong filter against spurious correlations. Corresponding CH- and OH-bands between FT-IR and NIR spectra gave significant regions. For the second dataset, the results from CMV are interpreted in more detail with chemical background knowledge in mind. Most of the significant regions found between the Raman and NIR spectra could be interpreted from the chemical composition of the oil mixtures. Some regions were more difficult to interpret, which could be due to systematic baseline effects in the NIR data.
