K-Means-Based Nature-Inspired Metaheuristic Algorithms for Automatic Data Clustering Problems: Recent Advances and Future Directions

Abiodun M Ikotun, Absalom E Ezugwu, Mubarak S Almutari

doi:10.3390/app112311246

Open Access

https://doi.org/10.3390/app112311246

Abstract

The K-means clustering algorithm is a partitional clustering algorithm that has been widely used in many applications for traditional clustering because of its simplicity and low computational complexity. The technique depends on the user specifying the number of clusters to generate from the dataset, and this choice affects the clustering results. Moreover, random initialization of the cluster centers causes the algorithm to converge to local minima. Automatic clustering is a recent approach in which the number of clusters does not need to be specified. In automatic clustering, the natural clusters existing in a dataset are identified without any background information about the data objects. Nature-inspired metaheuristic optimization algorithms have recently been deployed to overcome the challenges that the traditional clustering algorithm faces in handling automatic data clustering. Some nature-inspired metaheuristic algorithms have been hybridized with the traditional K-means algorithm to boost its performance and its capability to handle automatic data clustering problems. This study aims to identify, retrieve, summarize, and analyze recently proposed studies related to improving the K-means clustering algorithm with nature-inspired optimization techniques. A quest approach for article selection was adopted, which led to the identification and selection of 147 related studies from different reputable academic avenues and databases. Moreover, the analysis revealed that, although the K-means algorithm has been well researched in the literature, the study clearly observed its superiority over several well-established state-of-the-art clustering algorithms in terms of speed, accessibility, simplicity of use, and applicability to clustering problems with unlabeled and nonlinearly separable datasets.
The current study also evaluated and discussed some of the well-known weaknesses of the K-means clustering algorithm that the existing improvement methods were designed to address. The systematic review and analysis of the existing literature on K-means enhancement approaches presents possible perspectives in the clustering analysis research domain and serves as a comprehensive source of information on the K-means algorithm and its variants for the research community.

Highlights

Data clustering is an aspect of data mining that aims at classifying or grouping data objects within a dataset based on their similarities and dissimilarities
The articles that were selected for this study were based on metrics such as article publishers, journals, citation numbers, and the impact factors
The largest number of articles were selected from IEEE with 46 articles, followed by Springer and Elsevier with 37 articles and 30 articles, respectively

Summary

Introduction

Data clustering is an aspect of data mining that aims at classifying or grouping data objects within a dataset based on their similarities and dissimilarities. A dataset is segmented into clusters so that the data objects within the same cluster are more similar to one another than to those in other clusters. In the hierarchical clustering technique, data objects are iteratively grouped in a hierarchical format to generate a dendrogram that depicts the clustering sequence of the dataset. In the K-means algorithm, objects are grouped into a user-specified number of clusters, k, based on the minimum distance between the data objects and the cluster centers [3]. The algorithm's dependence on the user's specification of the number of clusters and on the random initialization of the cluster centers limits the performance and the accuracy of the clustering results. Different values of k produce different clustering results, and the random selection of the initial cluster centers makes the algorithm tend to converge to local minima.
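The two sensitivities described above can be illustrated with a minimal NumPy sketch of the standard assign-and-update K-means loop. This is an illustration under stated assumptions, not the implementation used in any of the reviewed studies: the function name `kmeans`, the synthetic three-blob dataset, the Euclidean distance, and the choice to initialize centers from randomly chosen data points are all assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, seed, n_iter=100):
    """Plain K-means sketch; returns (centers, labels, sse)."""
    rng = np.random.default_rng(seed)
    # Random initialization: pick k distinct data points as starting centers.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each object joins the cluster of its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment and sum of squared errors for the returned centers.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    sse = float((dists[np.arange(len(X)), labels] ** 2).sum())
    return centers, labels, sse

# Hypothetical synthetic data: three Gaussian blobs in 2-D.
rng = np.random.default_rng(42)
blob_centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.5 * rng.standard_normal((50, 2)) for c in blob_centers])

# Vary both the user-specified k and the random seed used for initialization.
for k in (2, 3, 4):
    sses = [kmeans(X, k, seed)[2] for seed in range(10)]
    print(f"k={k}: best SSE={min(sses):.2f}, worst SSE={max(sses):.2f}")
```

Any gap between the best and worst SSE for a fixed k reflects the dependence on initialization, while the change across k reflects the dependence on the user-specified cluster number. Restarting from several seeds and keeping the lowest-SSE run is a common mitigation for the first issue; it does not address the second.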

Journal: Applied Sciences	Publication Date: Nov 26, 2021
Citations: 42	License type: CC BY 4.0

Similar Papers

Nature-inspired metaheuristic techniques for automatic clustering: a survey and performance study
Absalom E Ezugwu
SN Applied Sciences | VOL. 2
25 Jan 2020

Automatic Data Clustering Using Hybrid Firefly Particle Swarm Optimization Algorithm
Moyinoluwa B Agbaje ... Rosanne Els
IEEE Access | VOL. 7
01 Jan 2019

K-Means Hybridization with Enhanced Firefly Algorithm for High-Dimension Automatic Clustering
Afroj Alam ... Muhammad Kalamuddin Ahamad
Journal of Advanced Research in Applied Sciences and Engineering Technology | VOL. 33
09 Nov 2023

Exploring meta-heuristics for partitional clustering: methods, metrics, datasets, and challenges
Arvinder Kaur ... Jagpreet Sidhu
Artificial Intelligence Review | VOL. 57
12 Sep 2024
