Abstract

This paper describes a new approach for disjoint hard modelling of classes. This involves developing independent PC models for each group in the class, and calculating both the Q statistic (squared prediction error) for each sample to the class model and a separate statistic describing how well samples are classified within the projected PC space. The latter statistic can be applied to different types of classifiers; in this paper we illustrate it with Quadratic Discriminant Analysis (QDA) (D statistic) and one-class Support Vector Domain Description (SVDD) (f-value). The two measures (Q and the classifier-dependent statistic) are combined into a joint decision function which uniquely classifies each sample. The disjoint hard models are contrasted with conjoint models, where PCA is performed on the entire dataset, using both QDA and Support Vector Machine (SVM) classifiers. The optimum number of PCs for each model is determined using the bootstrap, and model performance is assessed on 100 test sets obtained using different iterative splits, using %PA (Predictive Ability) and %CR (Classification Rate). The method is illustrated using a dataset consisting of 293 samples from nine groups of polymers obtained using thermal profiling. The approach described in this paper has many of the advantages of one-class disjoint models (e.g. SIMCA) and of conventional hard models, and is useful if it is known that all samples must belong to one of a series of known groups but each group has a very different structure. Copyright © 2010 John Wiley & Sons, Ltd.
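
The idea of a disjoint hard model can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method as implemented in the paper: the abstract does not give the form of the joint decision function, so the sketch simply adds Q and D after dividing each by its training mean for that class, and it takes D to be the squared Mahalanobis distance in the class's PC score space. The function names (`fit_class_model`, `q_and_d`, `classify`) and the scaling are hypothetical.

```python
import numpy as np


def fit_class_model(X, n_pcs):
    """Fit an independent PC model to the samples of one class."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Loadings for this class only (rows of Vt are principal directions).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_pcs].T                       # shape (n_vars, n_pcs)
    T = Xc @ P                             # training scores
    cov = np.atleast_2d(np.cov(T, rowvar=False))
    model = {"mean": mean, "P": P, "cov_inv": np.linalg.inv(cov)}
    # ASSUMPTION: scale each statistic by its training mean so that
    # Q and D are comparable before they are combined.
    q, d = q_and_d(model, X)
    eps = 1e-12                            # guards against a zero mean
    model["q_scale"] = q.mean() + eps
    model["d_scale"] = d.mean() + eps
    return model


def q_and_d(model, X):
    """Return Q (squared prediction error) and D for each row of X."""
    Xc = X - model["mean"]
    T = Xc @ model["P"]                    # projection into PC space
    resid = Xc - T @ model["P"].T          # part not explained by the PCs
    q = np.sum(resid ** 2, axis=1)
    # ASSUMPTION: D is the squared Mahalanobis distance of the scores.
    d = np.einsum("ij,jk,ik->i", T, model["cov_inv"], T)
    return q, d


def classify(models, X):
    """Assign each sample to the class whose combined statistic is lowest."""
    combined = []
    for m in models:
        q, d = q_and_d(m, X)
        # ASSUMPTION: the joint decision function is a plain sum of the
        # two scaled statistics; the paper's actual function may differ.
        combined.append(q / m["q_scale"] + d / m["d_scale"])
    return np.argmin(np.column_stack(combined), axis=1)
```

Because every sample is assigned to the class with the smallest combined value, the classification is "hard" (each sample gets exactly one label), while each class keeps its own PC model and its own number of PCs, as in one-class approaches such as SIMCA.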
