Abstract

The methodology and mathematical treatment of several classic multivariate methods for the analysis of spectroscopic data are demonstrated in a straightforward way that can serve as a basis for teaching an undergraduate introductory course on chemometric analysis. The multivariate techniques of classical least-squares (CLS), principal component regression (PCR), and partial least-squares (PLS), as well as the univariate Beer’s law method, are described and compared, building students’ understanding by starting with the univariate method and progressing step by step to the multivariate methods. Equations for producing regression vectors from training set spectral data are described, and their use is demonstrated by predicting constituent concentrations for a separate validation set of spectra. Extreme care is taken to ensure consistency in the variable formatting of data matrices, which provides a key foundation for understanding how spectral data are manipulated by these different mathematical approaches to building quantitative regression models. Each method is applied to a real-world data set, and the results are discussed to show students the types of information that can be gleaned from each method. A training set of 20 infrared absorbance spectra of samples of known composition, containing 3 constituents (benzene, polystyrene, and gasoline), is used to demonstrate the matrix operations for each regression method. A separate set of 12 real-world napalm samples (containing benzene, polystyrene, and gasoline) is used as a validation set to demonstrate the application of the regression models to an unknown data set. A toolbox (PNNL Chemometric Toolbox) written in the MATLAB language is supplied in the Supporting Information and can be used as a companion for understanding the development and deployment of the chemometric algorithms described in this paper. The data sets of infrared spectra are also supplied, allowing users to build and inspect the chemometric models on their own. Finally, the Toolbox includes scripts to assist users in loading their own data sets into MATLAB and performing CLS, PCR, and PLS on their data.
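As a rough illustration of the CLS workflow summarized above (calibrate on a training set, then predict concentrations for a separate validation set), the following minimal sketch uses Python with NumPy rather than the MATLAB toolbox. It is not the paper's code or data: the spectra are synthetic, the names `K_true`, `K_hat`, and `B` are hypothetical, and the layout (rows are samples, columns are wavelength channels) is an assumed convention that may differ from the one the paper adopts. The model assumed is the usual CLS form A = C K, with prediction via the pseudoinverse of the estimated K.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the paper's spectra): sizes mirror the abstract
# (20 training spectra, 12 validation spectra, 3 constituents); 50 channels is arbitrary.
n_train, n_val, n_const, n_wave = 20, 12, 3, 50
K_true = rng.random((n_const, n_wave))      # hypothetical pure-component spectra
C_train = rng.random((n_train, n_const))    # known training concentrations
C_val = rng.random((n_val, n_const))        # held back, used only to score predictions

# Beer's-law-style linear mixing plus a little measurement noise.
A_train = C_train @ K_true + 1e-3 * rng.standard_normal((n_train, n_wave))
A_val = C_val @ K_true + 1e-3 * rng.standard_normal((n_val, n_wave))

# Calibration: least-squares estimate of K in A = C K.
K_hat, *_ = np.linalg.lstsq(C_train, A_train, rcond=None)

# Regression vectors: columns of pinv(K_hat) = K^T (K K^T)^-1, one per constituent.
B = np.linalg.pinv(K_hat)                   # shape (n_wave, n_const)

# Prediction on the validation spectra and per-constituent RMS error.
C_pred = A_val @ B
rmsep = np.sqrt(np.mean((C_pred - C_val) ** 2, axis=0))
print("RMSEP per constituent:", rmsep)
```

PCR and PLS follow the same calibrate-then-predict pattern but build the regression vectors from a reduced set of latent factors instead of from estimated pure-component spectra.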
