An efficient k-means clustering algorithm: analysis and implementation

T. Kanungo, A.Y. Wu, N.S. Netanyahu, D.M. Mount, C.D. Piatko, R. Silverman

doi:10.1109/tpami.2002.1017616

Abstract

In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k, and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. A popular heuristic for k-means clustering is Lloyd's (1982) algorithm. We present a simple and efficient implementation of Lloyd's k-means clustering algorithm, which we call the filtering algorithm. This algorithm is easy to implement, requiring a kd-tree as the only major data structure. We establish the practical efficiency of the filtering algorithm in two ways. First, we present a data-sensitive analysis of the algorithm's running time, which shows that the algorithm runs faster as the separation between clusters increases. Second, we present a number of empirical studies both on synthetically generated data and on real data sets from applications in color quantization, data compression, and image segmentation.
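For orientation, the objective and the basic Lloyd iteration described in the abstract can be sketched as follows. This is a minimal, brute-force illustration in plain Python, not the paper's filtering algorithm: it compares every point against every center instead of pruning candidate centers with a kd-tree. The function names (`squared_distance`, `lloyd_kmeans`, `mean_squared_distance`), the random-sample initialization, and the `iterations` and `seed` parameters are assumptions made for this example and do not come from the paper.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd_kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd iteration (illustrative sketch, hypothetical signature).

    Alternates between assigning each point to its nearest center and
    moving each center to the mean of the points assigned to it, until
    the assignment stops changing or the iteration cap is reached.
    """
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct data points chosen at random.
    centers = [tuple(p) for p in rng.sample(points, k)]
    assignment = None
    for _ in range(iterations):
        # Assignment step: index of the nearest center for every point.
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the centers are stable
        assignment = new_assignment
        # Update step: move each center to the centroid of its cluster.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous center
                dim = len(members[0])
                centers[j] = tuple(
                    sum(m[i] for m in members) / len(members)
                    for i in range(dim)
                )
    return centers, assignment


def mean_squared_distance(points, centers):
    """The k-means objective: mean squared distance to the nearest center."""
    total = sum(min(squared_distance(p, c) for c in centers) for p in points)
    return total / len(points)


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    found_centers, labels = lloyd_kmeans(data, k=2)
    print(found_centers, labels, mean_squared_distance(data, found_centers))
```

Each assignment step in this sketch costs on the order of n·k distance computations. That per-iteration cost is what a kd-tree based approach such as the filtering algorithm aims to reduce.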

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Publication Date: Jul 1, 2002
Citations: 4927
