Least Squares Means Multiple Comparison Testing of Reference versus Predicted Residuals for Evaluation of Partial Least Squares Spectral Calibrations

J.B. Reeves, V.B. Reeves, S.R. Delwiche

doi:10.1255/jnirs.663

Abstract

It is common to use data pre-treatments such as scatter correction, derivatives, mean centring and variance scaling prior to the development of near- and mid-infrared spectral calibrations. As a result, it is possible to generate a multitude of calibrations, many of which will have similar statistical properties, as measured by the coefficient of determination (R²) and the residual error. Calibration equations tend to give their most optimistic modelling statistics on the data set from which they were developed rather than on validation data sets; moreover, the pre-treatment that was optimal for one set of samples may not be the best for future samples and may therefore not yield the most robust calibration. If several calibrations are found to be statistically the same, then other criteria could be used to determine which one to use (for example, the one with the fewest partial least squares (PLS) factors, or one chosen on the basis of past experience), or further investigations could be carried out on the more limited set of calibrations deemed to represent the best of all those originally developed. However, there has been no single accepted statistical procedure for determining which calibrations within a large group are statistically the same and which are not. This study describes the use of least squares means multiple comparisons testing of squared reference-versus-predicted residuals for determining the statistical similarity of multiple PLS calibrations. A program has been developed using common commercial statistical software (SAS, Mixed procedure) which computes and summarises comparisons of PLS calibrations. This method is also applicable to other multivariate regression methods, as it requires only a list of reference and predicted values from each calibration as input.
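The idea of comparing calibrations through their squared residuals can be illustrated with a minimal Python sketch. This is not the authors' SAS program and does not reproduce the Mixed procedure's least squares means tests: as a stand-in, it uses an exact paired sign-flip permutation test on the per-sample differences in squared residuals, with a Bonferroni adjustment for the number of pairwise comparisons. The function names, the calibration labels and the eight reference and predicted values are hypothetical and chosen purely for illustration.

```python
from itertools import combinations, product
from statistics import mean


def squared_residuals(reference, predicted):
    """Squared (reference - predicted) residual for each sample."""
    return [(r - p) ** 2 for r, p in zip(reference, predicted)]


def sign_flip_p_value(diffs):
    """Exact two-sided sign-flip permutation p-value for a zero mean difference.

    Enumerates all 2**n sign assignments, so it is only practical for small n.
    """
    observed = abs(sum(diffs))
    count = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # small tolerance for rounding error
            count += 1
    return count / total


def compare_calibrations(reference, predictions, alpha=0.05):
    """Pairwise comparison of calibrations on their squared residuals.

    Returns one tuple per pair:
    (name_a, name_b, mean_sq_a, mean_sq_b, bonferroni_p, differs).
    """
    sq = {name: squared_residuals(reference, pred)
          for name, pred in predictions.items()}
    pairs = list(combinations(sq, 2))
    results = []
    for a, b in pairs:
        diffs = [x - y for x, y in zip(sq[a], sq[b])]
        p_adj = min(1.0, sign_flip_p_value(diffs) * len(pairs))
        results.append((a, b, mean(sq[a]), mean(sq[b]), p_adj, p_adj < alpha))
    return results


# Hypothetical reference values and predictions from three calibrations.
reference = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0]
predictions = {
    "A": [10.1, 11.9, 14.1, 15.9, 18.1, 19.9, 22.1, 23.9],
    "B": [9.9, 12.1, 13.9, 16.1, 17.9, 20.1, 21.9, 24.1],
    "C": [12.0, 10.0, 16.5, 13.5, 20.0, 18.0, 24.5, 21.5],
}

results = compare_calibrations(reference, predictions)
for a, b, mse_a, mse_b, p_adj, differs in results:
    verdict = "different" if differs else "not distinguishable"
    print(f"{a} vs {b}: MSE {mse_a:.3f} vs {mse_b:.3f}, adjusted p = {p_adj:.4f} ({verdict})")
```

As in the method described above, the only inputs are the reference values and each calibration's predicted values, so the same comparison could in principle be applied to predictions from any multivariate regression method.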
