Abstract
Root mean square error of prediction (RMSEP) is widely used as a criterion for judging the performance of a multivariate calibration model; often it is the sole criterion. Two methods are discussed for estimating the uncertainty in estimates of RMSEP. The first follows from the approximate sampling distribution of the mean square error of prediction (MSEP), while the second is based on error propagation, which is a distribution-free approach. The results of a small Monte Carlo simulation study suggest that, provided extreme outliers are removed from the test set, MSEP estimates are approximately proportional to a χ² random variable with n degrees of freedom, where n is the number of samples in the test set. It is shown in detail how this knowledge can be used to determine the size of an adequate test set, and the advantages over the guideline issued by the American Society for Testing and Materials (ASTM) are discussed. The expression derived by error propagation is shown to systematically overestimate the true uncertainty, so a correction factor is introduced to ensure approximately correct behaviour. Close agreement is found between the uncertainties calculated using the two complementary methods. The consequences of using too small a test set are illustrated on a practical data set.
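The χ² approximation stated above can be illustrated with a minimal sketch. It is not the procedure used in the paper: it assumes independent, zero-mean, normally distributed prediction errors, and all function names are hypothetical. Under that assumption the variance of a χ² variable with n degrees of freedom is 2n, so the relative standard uncertainty of MSEP is √(2/n), and first-order error propagation gives roughly 1/√(2n) for RMSEP. The sketch compares this figure with a small simulation and inverts it to obtain a test set size.

```python
import math
import random
import statistics


def rmsep(errors):
    """Root mean square error of prediction from a list of prediction errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def relative_uncertainty(n):
    """Approximate relative standard uncertainty of an RMSEP estimate.

    Assumes n * MSEP_estimate / MSEP_true follows a chi-square distribution
    with n degrees of freedom (illustrative assumption, see lead-in).
    """
    return 1.0 / math.sqrt(2.0 * n)


def required_test_set_size(target_relative_uncertainty):
    """Smallest n whose approximate relative uncertainty meets the target."""
    return math.ceil(1.0 / (2.0 * target_relative_uncertainty ** 2))


def simulated_relative_sd(n, n_trials=5000, sigma=1.0, seed=0):
    """Empirical relative standard deviation of RMSEP over simulated test sets.

    Prediction errors are drawn as independent N(0, sigma^2) values, which is
    an assumption of this sketch rather than a property of real data.
    """
    rng = random.Random(seed)
    estimates = [
        rmsep([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(n_trials)
    ]
    return statistics.stdev(estimates) / statistics.mean(estimates)


if __name__ == "__main__":
    for n in (10, 20, 50, 200):
        print(n, relative_uncertainty(n), simulated_relative_sd(n))
    print("n for 10% relative uncertainty:", required_test_set_size(0.10))
```

Running the script prints the approximate and simulated relative uncertainties side by side for several test set sizes. Under these assumptions a small test set leaves an RMSEP estimate with a relative uncertainty well above 10%, which is the kind of consequence the abstract refers to.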