Abstract

It was posited that a good cluster solution has two characteristics: (1) it is stable across multiple random samples; and (2) its clusters accurately correspond to the populations from which the sample was drawn. A technique for evaluating a minimum variance cluster solution (Ward, 1963) was presented that addresses these two characteristics of a solution. In this technique, which follows the cross-validation paradigm, two data sets were drawn from a population mixture. One data set was cluster analyzed (by the minimum variance procedure), and the centroid vectors for the solution were calculated. After the other data set was cluster analyzed, its objects were assigned to the nearest centroid calculated in the first data set. The outcome is a kappa statistic which measures the agreement between the minimum variance cluster solution of the second data set and the classifications made by the nearest-centroid assignment rule. Results from Monte Carlo investigations of the technique showed that the new approach has considerable merit in evaluating a minimum variance solution. That is, the technique's sensitivity to the difficulty of the cluster solution (operationalized as the degree of overlap among subpopulations) and the accuracy of the solution were found to be high in computer-generated multivariate data except in cases where the data departed considerably from normality.
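The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`ward_clusters`, `nearest_centroid`, `cohen_kappa`, `cross_validated_kappa`, `make_sample`) are hypothetical, the Ward agglomeration is a naive version suitable only for small samples, and the synthetic two-population data are invented for the demonstration. The abstract does not say how the arbitrary cluster labels of the second solution are matched to the first-sample centroids; the sketch assumes the relabeling that maximizes agreement.

```python
import itertools
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Mean vector of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def ward_clusters(points, k):
    """Naive minimum variance (Ward) agglomeration down to k clusters.

    At each step, merge the pair of clusters whose fusion gives the
    smallest increase in the within-cluster sum of squares.
    Returns one integer label per point.
    """
    clusters = [[i] for i in range(len(points))]
    cents = [tuple(p) for p in points]
    while len(clusters) > k:
        best = None
        for a, b in itertools.combinations(range(len(clusters)), 2):
            na, nb = len(clusters[a]), len(clusters[b])
            cost = na * nb / (na + nb) * sq_dist(cents[a], cents[b])
            if best is None or cost < best[0]:
                best = (cost, a, b)
        _, a, b = best  # a < b, so deleting index b leaves a valid
        clusters[a] = clusters[a] + clusters[b]
        cents[a] = centroid([points[i] for i in clusters[a]])
        del clusters[b], cents[b]
    labels = [0] * len(points)
    for lab, members in enumerate(clusters):
        for i in members:
            labels[i] = lab
    return labels


def nearest_centroid(points, cents):
    """Assign each point to the index of its closest centroid."""
    return [min(range(len(cents)), key=lambda j: sq_dist(p, cents[j]))
            for p in points]


def cohen_kappa(r1, r2, k):
    """Cohen's kappa for two labelings over categories 0..k-1."""
    n = len(r1)
    po = sum(a == b for a, b in zip(r1, r2)) / n
    pe = sum((r1.count(c) / n) * (r2.count(c) / n) for c in range(k))
    return (po - pe) / (1 - pe) if pe < 1 else 1.0


def cross_validated_kappa(sample_a, sample_b, k):
    """Agreement between sample B's own Ward solution and the
    nearest-centroid classification based on sample A's centroids."""
    labels_a = ward_clusters(sample_a, k)
    cents_a = [centroid([p for p, l in zip(sample_a, labels_a) if l == c])
               for c in range(k)]
    labels_b = ward_clusters(sample_b, k)
    assigned_b = nearest_centroid(sample_b, cents_a)
    # Assumption: cluster labels are arbitrary, so try every relabeling
    # of sample B's clusters and keep the one with the highest agreement.
    return max(cohen_kappa([perm[l] for l in labels_b], assigned_b, k)
               for perm in itertools.permutations(range(k)))


def make_sample(rng, centers=((0.0, 0.0), (20.0, 20.0)), n_per=15):
    """Hypothetical mixture: n_per bivariate normal points per center."""
    return [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
            for cx, cy in centers for _ in range(n_per)]


rng = random.Random(0)
sample_a = make_sample(rng)
sample_b = make_sample(rng)
kappa = cross_validated_kappa(sample_a, sample_b, k=2)
print(f"kappa = {kappa:.2f}")
```

With subpopulations this far apart, the two classifications of the second sample should agree almost perfectly; moving the centers closer together in `make_sample` increases the overlap and would be expected to lower kappa, which is the sensitivity the abstract describes.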
