Experiments for the Number of Clusters in K-Means

Mark Ming-Tso Chiang, Boris Mirkin

doi:10.1007/978-3-540-77002-2_33

Abstract

K-means is one of the most popular data mining and unsupervised learning algorithms for solving the well-known clustering problem. The procedure partitions a given data set into a pre-specified number of clusters K, so the problem of determining "the right number of clusters" has attracted considerable interest. However, to the authors' knowledge, no experimental comparison of the methods proposed for this problem has been reported so far. This paper presents the results of such a comparison involving eight selection options representing four approaches. We generate data according to a Gaussian-mixture distribution in which the clusters vary in spread and spatial size. The most consistent results are shown by the least squares and least modules versions of an intelligent version of the method, iK-Means by Mirkin [14]. However, the right K is reproduced best by Hartigan's [5] method. This leads us to propose an adjusted iK-Means method, which performs well in the current experimental setting.
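To make the idea of a selection rule for K concrete, the following is a minimal sketch of Hartigan's rule as it is commonly stated in the literature, not the authors' experimental code. The abstract does not spell out the rule; the index H(K) = (W_K / W_{K+1} - 1)(N - K - 1) and the cut-off of 10 are taken from the usual textbook formulation, where W_K is the within-cluster sum of squares of a K-cluster partition and N is the number of points. The function names `sq_dist`, `kmeans_wss`, and `hartigan_choose_k`, the restart count, and the synthetic data are all illustrative assumptions.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_wss(points, k, n_init=5, max_iter=100, seed=0):
    """Smallest within-cluster sum of squares W_K over n_init random starts."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(n_init):
        centers = rng.sample(points, k)  # k data points drawn without replacement
        for _ in range(max_iter):
            # Assignment step: each point joins its nearest center.
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
                clusters[nearest].append(p)
            # Update step: recompute centroids; an empty cluster keeps its center.
            updated = [
                tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[j]
                for j, c in enumerate(clusters)
            ]
            if updated == centers:
                break
            centers = updated
        wss = sum(min(sq_dist(p, c) for c in centers) for p in points)
        best = min(best, wss)
    return best


def hartigan_choose_k(wss, n, threshold=10.0):
    """wss[i] holds W_K for K = i + 1; return the first K with H(K) <= threshold.

    Returns None when no K in the supplied range meets the threshold.
    """
    for i in range(len(wss) - 1):
        k = i + 1
        if wss[i + 1] == 0:
            continue  # H(K) is unbounded when K + 1 clusters fit perfectly
        h = (wss[i] / wss[i + 1] - 1.0) * (n - k - 1)
        if h <= threshold:
            return k
    return None


if __name__ == "__main__":
    # Hypothetical data: three spherical Gaussian clusters in 10 dimensions.
    rng = random.Random(1)
    dim, per_cluster = 10, 50
    data = []
    for shift in (0.0, 8.0, 16.0):
        for _ in range(per_cluster):
            data.append(tuple(rng.gauss(shift, 1.0) for _ in range(dim)))
    wss_by_k = [kmeans_wss(data, k) for k in range(1, 7)]
    print("K chosen by Hartigan's rule:", hartigan_choose_k(wss_by_k, len(data)))
```

Separating the clustering step from the rule keeps the rule itself easy to check by hand: given W_1 to W_5 of 1000, 400, 100, 95, 92 for N = 100, the index is 147 at K = 1, 291 at K = 2, and about 5.05 at K = 3, so the sketch returns 3.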

