The Effectiveness of Lloyd-Type Methods for the k-Means Problem

Rafail Ostrovsky, Leonard J Schulman, Chaitanya Swamy, Yuval Rabani

doi:10.1145/2395116.2395117

Abstract

We investigate variants of Lloyd's heuristic for clustering high-dimensional data in an attempt to explain its popularity (a half century after its introduction) among practitioners, and in order to suggest improvements in its application. We propose and justify a clusterability criterion for data sets. We present variants of Lloyd's heuristic that quickly lead to provably near-optimal clustering solutions when applied to well-clusterable instances. This is the first performance guarantee for a variant of Lloyd's heuristic. The provision of a guarantee on output quality does not come at the expense of speed: some of our algorithms are candidates for being faster in practice than currently used variants of Lloyd's method. In addition, our other algorithms are faster on well-clusterable instances than recently proposed approximation algorithms, while maintaining similar guarantees on clustering quality. Our main algorithmic contribution is a novel probabilistic seeding process for the starting configuration of a Lloyd-type iteration.
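The abstract does not spell out the seeding process, so the following is only a rough sketch of the general idea of pairing a probabilistic seeding step with a Lloyd-type iteration. It assumes a distance-weighted rule (each new center drawn with probability proportional to its squared distance from the nearest center already chosen), which may differ from the authors' exact procedure. The function names `d2_seed`, `lloyd`, and `kmeans_cost` and the toy data are illustrative and do not come from the paper.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_cost(points, centers):
    """k-means objective: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def d2_seed(points, k, rng):
    """Assumed seeding rule (not taken from the abstract): draw each new
    center with probability proportional to its squared distance from
    the nearest center chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(weights) == 0:  # every point coincides with a chosen center
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

def lloyd(points, centers, max_iters=100):
    """Alternate assignment and centroid steps until the centers stop moving."""
    for _ in range(max_iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(xs) / len(cl) for xs in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

# Toy data: two well-separated groups of three points each.
rng = random.Random(0)
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
seeds = d2_seed(points, 2, rng)
final_centers = lloyd(points, seeds)
print(kmeans_cost(points, seeds), kmeans_cost(points, final_centers))
```

On data with well-separated groups, distance-weighted seeding tends to place the initial centers in different groups, so the Lloyd iteration has little left to correct. The sketch makes no claim about the guarantees stated in the abstract.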

Journal: Journal of the ACM
Publication Date: Dec 1, 2012
Citations: 177
License type: mit

Similar Papers

The Effectiveness of Lloyd-Type Methods for the k-Means Problem
Rafail Ostrovsky ... Leonard Schulman
01 Jan 2006

Detecting Meaningful Clusters From High-Dimensional Data: A Strongly Consistent Sparse Center-Based Clustering Approach.
Saptarshi Chakraborty ... Swagatam Das
IEEE Transactions on Pattern Analysis and Machine Intelligence | VOL. 44
25 Dec 2020

MinCEntropy: A Novel Information Theoretic Approach for the Generation of Alternative Clusterings
Nguyen Xuan Vinh ... Julien Epps
01 Dec 2010

Clustering Evaluation in High-Dimensional Data
Nenad Tomašev ... Miloš Radovanović
01 Jan 2015
