Abstract

Clustering is a fundamental data analysis task, and for algorithms such as k-means the quality of the result depends heavily on how the initial centroids are chosen. This work compares six established initialization methods (random, Forgy, k-means++, PCA, hierarchical clustering, and naive sharding) with three swarm intelligence-based approaches: Spider Monkey Optimization (SMO), Whale Optimization Algorithm (WOA), and Grey Wolf Optimizer (GWO), each combined with k-means (SMOKM, WOAKM, and GWOKM, respectively). Results on ten well-known datasets strongly favor the swarm intelligence-based techniques, with SMOKM consistently outperforming WOAKM and GWOKM. Swarm intelligence, especially SMOKM, effectively generates distinct and well-separated clusters, which is valuable in resource-constrained settings. Traditional methods such as hierarchical clustering, PCA, and k-means++ show promise on specific datasets but consistently underperform the swarm intelligence-based alternatives. Overall, the work offers guidance for selecting and evaluating centroid initialization techniques for k-means clustering across diverse datasets.
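To make concrete what a centroid initialization technique is, the following minimal sketch contrasts two of the established methods named above: a Forgy-style start (k data points drawn uniformly at random) and k-means++ seeding. It is an illustration under stated assumptions and not the authors' implementation. The function names, the synthetic three-blob data, and the use of the sum of squared errors (SSE) as the comparison score are all assumptions made here; the swarm intelligence-based variants (SMOKM, WOAKM, GWOKM) are not reproduced.

```python
import numpy as np

def forgy_init(X, k, rng):
    """Forgy-style start: k distinct data points chosen uniformly at random."""
    idx = rng.choice(len(X), size=k, replace=False)
    return X[idx].copy()

def kmeanspp_init(X, k, rng):
    """k-means++ seeding: each new centroid is a data point sampled with
    probability proportional to its squared distance to the nearest
    centroid chosen so far."""
    centroids = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        diffs = X[:, None, :] - np.array(centroids)[None, :, :]
        d2 = (diffs ** 2).sum(axis=2).min(axis=1)
        centroids.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centroids)

def sse(X, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float(d2.min(axis=1).sum())

def lloyd(X, centroids, n_iter=50):
    """Plain Lloyd iterations starting from the given centroids."""
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Keep the old centroid if a cluster ends up empty.
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, sse(X, centroids)

# Synthetic data (an assumption for illustration): three Gaussian blobs in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2))
               for c in ([0, 0], [5, 5], [0, 5])])

for name, init in [("forgy", forgy_init), ("k-means++", kmeanspp_init)]:
    _, score = lloyd(X, init(X, 3, rng))
    print(name, round(score, 2))
```

Both functions return a (k, d) array of starting centroids, so any other initialization strategy can be swapped in and compared on the same final SSE.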
