Abstract

Sample outlier detection is imperative before calculating a multivariate calibration model. Outliers, especially in high-dimensional space, can be difficult to detect. The outlier measures Hotelling's T-squared, Q-residuals, and Studentized residuals are standard in analytical chemistry with spectroscopic data. However, these and other merits depend on tuning parameters and are sensitive to the outliers themselves, i.e., the measures are susceptible to swamping and masking. In addition, different samples are often flagged as outliers depending on the outlier measure used. Sum of ranking differences (SRD) is a new generic fusion tool that can simultaneously evaluate multiple outlier measures across windows of tuning parameter values, thereby simplifying and improving outlier detection. This paper presents SRD for detecting multiple outliers despite the effects of masking and swamping. Spectral outliers (x-outliers) and analyte outliers (y-outliers) can be detected separately or in tandem with SRD using the respective merits. Unique to SRD are fusion verification processes that confirm samples flagged as outliers. The SRD process also allows for sample masking checks. Several new outlier detection measures are presented and used by SRD, including atypical uses of Procrustes analysis and extended inverted signal correction (EISC). The methodologies are demonstrated on two near-infrared (NIR) data sets.
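
The core SRD calculation can be illustrated with a minimal sketch in Python. This is an illustration under stated assumptions, not the procedure used in the paper: the table layout (rows are outlier measures at individual tuning parameter values, columns are samples), the row-wise median as the reference, and the function names `average_ranks` and `srd_scores` are all hypothetical choices made here. Row scaling, the choice of reference, and the fusion verification steps are not reproduced.

```python
from statistics import median


def average_ranks(values):
    """Rank values ascending (1 = smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def srd_scores(table):
    """Return one SRD value per column of `table`.

    table[i][j] is the value of row i (assumed: one outlier measure at one
    tuning parameter value) for column j (assumed: one sample).
    The reference column is the row-wise median, an assumption of this sketch.
    """
    reference = [median(row) for row in table]
    ref_ranks = average_ranks(reference)
    scores = []
    for j in range(len(table[0])):
        col_ranks = average_ranks([row[j] for row in table])
        # SRD: sum of absolute differences between column and reference ranks.
        scores.append(sum(abs(a - b) for a, b in zip(col_ranks, ref_ranks)))
    return scores


# Toy table: the third column orders the rows opposite to the other two.
toy = [[1, 1, 4],
       [2, 2, 3],
       [3, 3, 2],
       [4, 4, 1]]
print(srd_scores(toy))
```

In this toy table the third column ranks the rows in the opposite order to the consensus, so it receives the largest SRD value. Whether a large SRD value marks a sample as an outlier depends on how the rows are scaled and which reference is chosen, neither of which is specified in the abstract.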
