Abstract

Clustering algorithms have been adopted in a wide range of applications in big data and pattern recognition. Optimized clustering is known to be NP-hard, and its convergence time is hard to determine, which poses challenges for deploying clustering methods in resource-constrained systems. In this paper, we present a mathematical framework to prove the convergence of the Gravity Based Clustering (GBC) algorithm, a recently introduced centroid-based clustering method. First, we analyze the convergence of the cluster centers generated by GBC in the Euclidean norm. Building on this analysis, we model cluster sets as random variables and verify the convergence of the cluster sets in probability. We then show that this framework applies not only to GBC but also to modeling the convergence of the cluster sets generated by any centroid-based clustering algorithm with convergent cluster centers. To demonstrate the accuracy of our framework, we carry out extensive experiments on several benchmark datasets. Moreover, we determine the values of the parameters $p$ and $\epsilon$, which play a key role in the convergence rate of GBC. We compare the convergence rate of the K-means and C-means methods with that of GBC, using the best values of the parameters $p$ and $\epsilon$ to implement GBC. Our results indicate that GBC is faster than the baseline methods on several datasets.
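To make the convergence criterion concrete, the sketch below runs a generic centroid-based clustering loop (Lloyd's K-means, used here as a stand-in since GBC's update rule is not given in this abstract) and records the Euclidean-norm displacement of the cluster centers at each iteration, stopping once it falls below a threshold `eps` that plays the role of the paper's $\epsilon$. The function name and parameters are illustrative, not taken from the paper.

```python
import numpy as np

def centroid_clustering_with_trace(X, k, eps=1e-4, max_iter=100, seed=0):
    """Generic centroid-based clustering loop (K-means updates), recording
    the Euclidean displacement of the cluster centers at each iteration,
    i.e., the quantity whose convergence the paper analyzes for GBC."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    displacements = []
    for _ in range(max_iter):
        # Assign each point to its nearest center in the Euclidean norm.
        labels = np.argmin(
            np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2), axis=1
        )
        # Recompute each center as the mean of its assigned points
        # (keep the old center if a cluster happens to be empty).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # Total center movement in the Euclidean norm.
        shift = np.linalg.norm(new_centers - centers)
        displacements.append(shift)
        centers = new_centers
        if shift < eps:  # declare convergence once movement drops below eps
            break
    return centers, labels, displacements
```

The recorded `displacements` sequence is what a convergence-rate comparison like the one in the paper would inspect: a faster-converging method drives the center displacement below $\epsilon$ in fewer iterations.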
