A variable-selection heuristic for K-means clustering

Michael J Brusco, J Dennis Cradit

doi:10.1007/bf02294838

Journal: Psychometrika
Publication Date: Jun 1, 2001
Citations: 146

Affiliation: Florida State University

#Problems In Cluster Analysis #Weighting Of Variables

Abstract

One of the most vexing problems in cluster analysis is the selection and/or weighting of variables in order to include those that truly define cluster structure, while eliminating those that might mask such structure. This paper presents a variable-selection heuristic for nonhierarchical (K-means) cluster analysis based on the adjusted Rand index for measuring cluster recovery. The heuristic was subjected to Monte Carlo testing across more than 2200 datasets with known cluster structure. The results indicate the heuristic is extremely effective at eliminating masking variables. A cluster analysis of real-world financial services data revealed that using the variable-selection heuristic prior to the K-means algorithm resulted in greater cluster stability.
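The abstract names the adjusted Rand index as the measure of cluster recovery but does not give its formula. As a minimal sketch, the standard pair-counting definition (Hubert and Arabie's chance-corrected form) can be computed as below. The function name `adjusted_rand_index` and the choice to return 1.0 for degenerate partitions are assumptions made for illustration; this is not the authors' code, and it does not implement the variable-selection heuristic itself, which the abstract does not describe.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same objects.

    Illustrative helper (hypothetical name), using the standard
    pair-counting formula with the usual chance correction.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same objects")
    n = len(labels_a)
    if n < 2:
        raise ValueError("at least two objects are required")

    # Contingency counts: cell sizes plus row and column marginals.
    cells = Counter(zip(labels_a, labels_b))
    rows = Counter(labels_a)
    cols = Counter(labels_b)

    # Number of object pairs placed together in each cell / cluster.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2        # largest attainable agreement

    if max_index == expected:
        # Degenerate case (e.g. both partitions are a single cluster);
        # returning 1.0 here is an assumed convention.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Same grouping under different label names counts as perfect recovery.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

A value of 1 indicates identical partitions regardless of how the clusters are labeled, values near 0 indicate agreement no better than chance, and negative values indicate agreement worse than chance.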
