An Improved K-means Clustering Algorithm Towards an Efficient Data-Driven Modeling

Md Zubair, Iqbal H Sarker, M J M Chowdhury, Avijeet Shil, Mohammad Ali Moni, Md Asif Iqbal

doi:10.1007/s40745-022-00428-2

Open Access

https://doi.org/10.1007/s40745-022-00428-2

Abstract

The K-means algorithm is one of the best-known unsupervised machine learning algorithms. It partitions data into distinct, non-overlapping clusters, assigning each point to exactly one group: each point goes to the cluster whose centroid is nearest by squared distance. One of the main concerns in K-means is choosing the initial centroids of the clusters, and determining the optimal positions of these centroids at the very first iteration is the most challenging task. This paper proposes an approach that finds the optimal initial centroids efficiently, reducing both the number of iterations and the execution time. To analyze the effectiveness of the proposed method, we conducted experiments on different real-world datasets. We first analyzed COVID-19 and patient datasets to show the method's efficiency. A synthetic dataset of 10M instances with 8 dimensions was also used to estimate the performance of the proposed algorithm. Experimental results show that the proposed method outperforms the traditional k-means++ and random centroid initialization methods in both computation time and number of iterations.
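To make the comparison in the abstract concrete, the following is a minimal sketch of the two baseline initializations it names, random and k-means++, feeding a plain Lloyd-style K-means loop that reports how many iterations it takes to converge. It does not implement the paper's proposed initialization, which the abstract does not describe. All function names (`random_init`, `kmeans_pp_init`, `lloyd`, `make_blobs`) and the toy data are assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def random_init(points, k, rng):
    """Baseline: pick k distinct data points uniformly at random."""
    return [list(p) for p in rng.sample(points, k)]


def kmeans_pp_init(points, k, rng):
    """Baseline: k-means++ seeding. Each new centroid is drawn with
    probability proportional to its squared distance from the nearest
    centroid chosen so far (assumes the points are not all identical)."""
    centroids = [list(rng.choice(points))]
    while len(centroids) < k:
        d2 = [min(sq_dist(p, c) for c in centroids) for p in points]
        centroids.append(list(rng.choices(points, weights=d2, k=1)[0]))
    return centroids


def lloyd(points, centroids, max_iter=100):
    """Standard K-means loop. Returns (centroids, labels, iterations),
    where iterations counts assignment passes, including the final pass
    that confirms the assignments no longer change."""
    centroids = [list(c) for c in centroids]
    labels = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: nearest centroid by squared distance.
        new_labels = [
            min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            return centroids, labels, iteration
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's centroid where it is
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels, max_iter


def make_blobs(rng, centers, n_per, spread=0.5):
    """Hypothetical toy data: Gaussian blobs around the given 2-D centers."""
    return [
        (cx + rng.gauss(0, spread), cy + rng.gauss(0, spread))
        for cx, cy in centers
        for _ in range(n_per)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    points = make_blobs(rng, [(0, 0), (10, 10), (0, 10)], 50)
    for name, init in [("random", random_init), ("k-means++", kmeans_pp_init)]:
        _, _, n_iter = lloyd(points, init(points, 3, rng))
        print(f"{name}: converged after {n_iter} iterations")
```

Iteration counts from a toy run like this depend on the seed and the data, so they say nothing about the results reported in the paper. The sketch only shows where an initialization method plugs in and how the iteration count can be measured.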

Journal: Annals of Data Science	Publication Date: Jun 25, 2022
Citations: 17	License type: NO-CC CODE

Similar Papers

Analysis of Building Electricity Use Pattern Using K-Means Clustering Algorithm by Determination of Better Initial Centroids and Number of Clusters
Bishnu Nepal ... Hiroya Sahashi
Energies | VOL. 12
25 Jun 2019

NDPD: an improved initial centroid method of partitional clustering for big data mining
Kamlesh Kumar Pandey ... Diwakar Shukla
Journal of Advances in Management Research | VOL. 20
23 Aug 2022

New Approach for K-mean and K-medoids Algorithm
Abhishek Patel ... Purnima Singh
International Journal of Computer Applications Technology and Research | VOL. 2
10 Jan 2012

Performance Analysis of Deterministic Centroid Initialization Method for Partitional Algorithms in Image Block Clustering
B Vinoth Kumar ... G R Karpagam
Indian Journal of Science and Technology | VOL. 8
01 Apr 2015
