Abstract

We propose a hierarchical clustering method that minimizes a joint between-within measure of distance between clusters. The method extends Ward's minimum variance method by defining the cluster distance and objective function in terms of Euclidean distance raised to any power in the interval (0,2]. Ward's method is obtained as the special case in which the power is 2. An important advantage of the proposed extension over geometric or cluster-center methods is its ability to identify clusters with nearly equal centers. The between-within distance statistic determines a clustering method that is ultrametric and space-dilating and, for powers strictly less than 2, determines a consistent test of homogeneity and a consistent clustering procedure. The clustering procedure is applied to three problems: classification of tumors by microarray gene expression data, classification of dermatology diseases by clinical and histopathological attributes, and classification of simulated multivariate normal data.
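
The abstract does not state the between-within distance explicitly. As a rough illustration only, the sketch below assumes the usual energy-distance form, e(A, B) = (n1 n2 / (n1 + n2)) (2 M_AB - M_AA - M_BB), where M_XY is the mean of the pairwise Euclidean distances between points of X and Y raised to the power alpha. The function name `e_distance`, the parameter name `alpha`, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def e_distance(a, b, alpha=1.0):
    """Between-within distance of two clusters (assumed energy-distance form).

    a, b  : arrays of shape (n_points, n_features)
    alpha : power applied to Euclidean distance, expected in (0, 2]
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n1, n2 = len(a), len(b)

    def mean_pow_dist(x, y):
        # Mean of ||x_i - y_j||^alpha over all pairs (i, j), including i == j.
        diff = x[:, None, :] - y[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=-1))
        return (dist ** alpha).mean()

    between = mean_pow_dist(a, b)
    within_a = mean_pow_dist(a, a)
    within_b = mean_pow_dist(b, b)
    return n1 * n2 / (n1 + n2) * (2.0 * between - within_a - within_b)


# Two toy clusters with the same center (0) but different spread.
tight = np.array([[-1.0], [1.0]])
wide = np.array([[-3.0], [3.0]])

d_ward = e_distance(tight, wide, alpha=2.0)  # depends only on the cluster means
d_e = e_distance(tight, wide, alpha=1.0)     # also sensitive to spread
```

Under this assumed form, the power 2 reduces the distance to a multiple of the squared distance between cluster means, so it is zero for the two toy clusters, whereas the power 1 gives a strictly positive value. This is one way to see why powers below 2 can separate clusters with nearly equal centers.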

