Abstract

A Monte Carlo study compared the usefulness of six variable weighting methods for cluster analysis. Datasets were 100 bivariate observations from two subgroups, generated according to a finite normal mixture model. Subgroup size, within-group correlation, within-group variance, and distance between subgroup centroids were manipulated.

Of the clustering methods examined, the flexible average clustering algorithm with β = -.15 or -.20 gave the best recovery. Of the remaining methods, Ward's method yielded the best recovery, followed closely by beta-flexible linkage (β = -.50) and SAS's EML algorithm.

In the absence of variable weights, negative within-group correlation resulted in much poorer recovery for all clustering algorithms. The ACE weighting method of Art, Gnanadesikan, and Kettenring provided a net improvement in 17-24% of the datasets when used with the better clustering algorithms. When used with the same clustering algorithms, De Soete's ultrametric weighting yielded improved recovery 16-22% of the time. However, ultrametric weighting was more sensitive than ACE to negative within-subgroup correlation. Clustering based on principal components was less effective. Therefore, the ACE method is preferred overall.

There is still room for improvement, however. Clustering with Mahalanobis distance based on the pooled within-group covariance matrix indicated that knowing the correct covariance matrix would yield improved recovery (over ACE) approximately 10% of the time.
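For readers who want a concrete picture of the design, the following is a minimal sketch of how one dataset of this kind might be generated and how a Mahalanobis distance based on the pooled within-group covariance matrix could be computed. It is not the study's code. The function names (`sample_mixture`, `pooled_within_cov`, `mahalanobis`), the seed, and every parameter value (mixing proportion, centroids, variance, correlation) are assumptions chosen for illustration; the abstract does not report the actual settings.

```python
import math
import random


def sample_mixture(n, p1, mu1, mu2, var, rho, rng):
    """Draw n bivariate points from a two-component normal mixture.

    Assumption: both subgroups share the same variance `var` on each axis
    and the same within-group correlation `rho`; p1 is the mixing
    proportion of subgroup 0.
    """
    sd = math.sqrt(var)
    points, labels = [], []
    for _ in range(n):
        g = 0 if rng.random() < p1 else 1
        mu = mu1 if g == 0 else mu2
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Apply the Cholesky factor of [[1, rho], [rho, 1]] to (z1, z2).
        x = mu[0] + sd * z1
        y = mu[1] + sd * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        points.append((x, y))
        labels.append(g)
    return points, labels


def pooled_within_cov(points, labels):
    """Pooled within-group covariance as (sxx, sxy, syy), divisor n - k."""
    groups = sorted(set(labels))
    sxx = sxy = syy = 0.0
    for g in groups:
        pts = [p for p, lab in zip(points, labels) if lab == g]
        mx = sum(p[0] for p in pts) / len(pts)
        my = sum(p[1] for p in pts) / len(pts)
        for x, y in pts:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    dof = len(points) - len(groups)
    return sxx / dof, sxy / dof, syy / dof


def mahalanobis(a, b, cov):
    """Mahalanobis distance between 2-D points a and b for a 2x2 covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = a[0] - b[0], a[1] - b[1]
    # Quadratic form (dx, dy) S^{-1} (dx, dy)^T with the 2x2 inverse written out.
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)


# Illustrative values only, not the study's settings.
rng = random.Random(0)
pts, labs = sample_mixture(100, 0.5, (0.0, 0.0), (3.0, 0.0), 1.0, -0.5, rng)
cov = pooled_within_cov(pts, labs)
d01 = mahalanobis(pts[0], pts[1], cov)
```

Note that `pooled_within_cov` uses the true subgroup labels, which are unknown in practice. This matches the abstract's framing of the Mahalanobis result as a benchmark for what knowing the correct covariance matrix would achieve, rather than as a usable weighting method.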
