A novel supervised classification algorithm, direct clustering in n-dimensional space (DCNS), was developed for difficult data sets where conventional methods of supervised clustering are expected to fail. When applied to spaces of more than three dimensions, the method relies on an algorithm that applies a special treatment to the measurement space, so that the treated space allows a computer-aided clustering methodology similar to that used by human vision. Unlike techniques that reduce the dimensionality of the space, the proposed method preserves the original dimensions and performs computer-simulated human-vision clustering in the original n-dimensional space. The overlap between clusters that results from dimensionality reduction is thus eliminated. The proposed method was applied to two real data sets, and the results were compared with those obtained using principal component analysis (PCA), an artificial neural network (ANN), and the k-nearest-neighbor (KNN) technique. On the first data set, which contains only two clusters, the DCNS algorithm gave better cluster separation than the other three methods. On the second data set, which contains eight different clusters, PCA, ANN and KNN were unable to give useful cluster separation, whereas the DCNS method separated all clusters and successfully assigned the unknown points to their corresponding clusters. The DCNS technique can also perform other important cluster analysis tasks, such as testing the discriminatory power of a variable, selecting one variable from many, and conducting preliminary unsupervised clustering. Copyright © 2000 John Wiley & Sons, Ltd.
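The cluster overlap attributed above to dimensionality reduction can be illustrated with a small synthetic sketch. It does not implement DCNS, whose details the abstract does not give. It only shows, under assumed data, how two clusters that are well separated along a low-variance axis can overlap after a PCA projection to two dimensions. The data, the helper names `pca_project` and `best_axis_separation`, and the separation measure (mean difference divided by pooled standard deviation, taken per axis) are all assumptions made for illustration.

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X onto the k directions of largest variance."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def best_axis_separation(A, B):
    """Largest per-axis ratio of |mean difference| to pooled standard deviation."""
    diff = np.abs(A.mean(axis=0) - B.mean(axis=0))
    pooled = np.sqrt(0.5 * (A.var(axis=0) + B.var(axis=0)))
    return float(np.max(diff / pooled))

# Hypothetical data: two 3-D clusters with large shared spread in x and y,
# and a small but clean offset along z (the low-variance axis).
rng = np.random.default_rng(0)
n = 200
scales = np.array([10.0, 10.0, 0.05])
A = rng.normal(size=(n, 3)) * scales
B = rng.normal(size=(n, 3)) * scales + np.array([0.0, 0.0, 1.0])

# PCA keeps the two highest-variance directions, which here lie almost
# entirely in the x-y plane, so the discriminating z axis is discarded.
Z = pca_project(np.vstack([A, B]), 2)

sep_original = best_axis_separation(A, B)
sep_projected = best_axis_separation(Z[:n], Z[n:])
print(f"separation in original 3-D space: {sep_original:.2f}")
print(f"separation after PCA to 2-D:      {sep_projected:.2f}")
```

In this constructed case the separation ratio is large in the original space and small after projection, because the projection keeps variance rather than class separation. Real data sets may behave differently, and the sketch says nothing about how ANN or KNN would perform.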