Abstract

Data clustering divides a set of data points into groups of similar points. Recent work on clustering has produced unsupervised learning algorithms based on Particle Swarm Optimization (PSO), including the Particle Swarm Clustering (PSC) and Modified PSC (mPSC) algorithms. However, PSC and mPSC tend to be computationally expensive when applied to high-dimensional, large-volume datasets. This paper presents a novel and more efficient swarm clustering strategy that we call Rapid Centroid Estimation (RCE). We compare the performance of RCE with that of PSC and mPSC in several ways, including complexity analyses and particle behavior analyses. Our benchmark testing suggests that RCE can reach a solution 274 times faster than PSC and 270 times faster than mPSC for a clustering task where the dataset has a dimension of 80 and a volume of 500. We also investigated particle behaviors on two-class, two-dimensional datasets with a volume of 500, comprising 250 data points for each well-separated class with known Gaussian centers. We found that RCE converged to the appropriate centers after 70 updates on average, compared to 19802 updates for PSC and 23006 updates for mPSC. An ANOVA indicates that RCE is significantly faster than both PSC and mPSC.
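To make the two-class experimental setup concrete, the following is a minimal sketch of how such a dataset could be generated and how recovery of the known centers could be checked. It does not implement RCE, PSC, or mPSC, none of which are specified in this abstract; a plain Lloyd's k-means stands in as the clustering step. The centers (0, 0) and (5, 5), the standard deviation of 0.5, the seeds, and all function names are assumptions chosen for illustration, and the update count it reports is not comparable to the figures quoted above.

```python
import random


def make_two_class_dataset(centers, n_per_class=250, std=0.5, seed=0):
    """Draw n_per_class 2-D points around each Gaussian center (assumed values)."""
    rng = random.Random(seed)
    points = []
    for cx, cy in centers:
        for _ in range(n_per_class):
            points.append((rng.gauss(cx, std), rng.gauss(cy, std)))
    return points


def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def lloyd_kmeans(points, k=2, max_updates=100, tol=1e-9, seed=1):
    """Plain Lloyd's k-means (a stand-in, NOT RCE/PSC/mPSC).

    Returns the final centroids and the number of centroid updates performed.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct data points as initial guesses
    for update in range(1, max_updates + 1):
        # Assignment step: attach every point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group.
        new_centroids = []
        for i, group in enumerate(groups):
            if group:
                mean_x = sum(x for x, _ in group) / len(group)
                mean_y = sum(y for _, y in group) / len(group)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(centroids[i])  # keep an empty group's centroid
        shift = max(squared_distance(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            return centroids, update
    return centroids, max_updates


# Hypothetical, well-separated Gaussian centers; 250 points per class, 500 in total.
TRUE_CENTERS = [(0.0, 0.0), (5.0, 5.0)]
data = make_two_class_dataset(TRUE_CENTERS)
centroids, updates = lloyd_kmeans(data)

# Distance from each known center to the closest estimated centroid.
errors = [min(squared_distance(c, m) for m in centroids) ** 0.5 for c in TRUE_CENTERS]
print("centroids:", centroids)
print("updates:", updates)
print("center errors:", errors)
```

Counting updates until the centroids stop moving mirrors the convergence measure described above only loosely; a faithful comparison would need the update rules of RCE, PSC, and mPSC from the full paper.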
