Maxmin Data Range Heuristic-Based Initial Centroid Method of Partitional Clustering for Big Data Mining

Kamlesh Kumar Pandey, Diwakar Shukla

doi:10.4018/ijirr.289954

Open Access

https://doi.org/10.4018/ijirr.289954

Abstract

Centroid-based clustering algorithms depend on the number of clusters, the initial centroids, the distance measure, and the statistical measure of central tendency. The centroid initialization algorithm determines convergence speed, computing efficiency, execution time, scalability, memory utilization, and performance for big data clustering. Researchers have proposed various cluster initialization techniques; some reduce the number of iterations at the cost of lower cluster quality, while others improve cluster quality at the cost of more iterations. For these reasons, this study proposes the Maxmin Data Range Heuristic (MDRH) initial centroid method for K-Means (KM) clustering, which reduces execution time and iterations and improves cluster quality for big data clustering. The proposed MDRH method is compared against the classical KM and KM++ algorithms on four real datasets. The MDRH method achieves better effectiveness and efficiency on the RS, DB, CH, SC, IS, and CT quantitative measures.

Highlights

The rapid development of digital technologies has produced enormous amounts of data in different formats at high speed, for example from social media.
This paper summarizes the value, veracity, variability, and visualization characteristics of big data as “Veracity validates the accuracy basis of variety, the value identifies predicted value based on volume and variety, variability presents specific analysis tools based on the volume and variety, and visualization visualized the results and problems based on the volume, variety, and velocity.”
Efficiency- and effectiveness-related results are shown in Tables 3-4; the reported result for each evaluation measure is the average of ten trials.

Summary

INTRODUCTION

The rapid development of digital technologies has produced enormous amounts of data in different formats at high speed, for example from social media. Table 1 shows the pros and cons of initial centroid methods for big data clustering, drawn from the discussed literature and comparative analyses (Celebi et al., 2013; Fränti & Sieranoja, 2019; He et al., 2004; Peña et al., 1999; Steinley & Brusco, 2007), across the random centroid, random partition, repeated heuristics, maxmin/distance optimization, greedy heuristics, sort heuristics, projection heuristics, density heuristics, and split heuristics categories. This parameter identifies the data processing capability in massive datasets for obtaining the initial centroids. The proposed work increases the convergence speed and speed-up and removes the worst case of local optima without affecting cluster quality and objective
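
For readers unfamiliar with the maxmin/distance optimization category named above, the following is a minimal Python sketch of generic farthest-first (maxmin) seeding. It is not the MDRH method proposed in the paper, whose details are not given in this summary. The function name `maxmin_init`, the choice of the first point as the starting seed, and the toy data are assumptions made for illustration only.

```python
import math


def maxmin_init(points, k):
    """Generic farthest-first (maxmin) seeding; NOT the paper's MDRH method."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption: seed with the first point. Other variants start from a
    # random point or from the point farthest from the data mean.
    centroids = [points[0]]
    # Distance from each point to its nearest centroid chosen so far.
    nearest = [math.dist(p, centroids[0]) for p in points]
    while len(centroids) < k:
        # Choose the point that is farthest from all current centroids.
        idx = max(range(len(points)), key=nearest.__getitem__)
        centroids.append(points[idx])
        # Refresh nearest-centroid distances with the newly added centroid.
        nearest = [min(d, math.dist(p, points[idx]))
                   for d, p in zip(nearest, points)]
    return centroids


# Toy data (hypothetical): two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 4.9), (10.0, 0.0)]
seeds = maxmin_init(points, 3)
```

Each new seed maximizes its minimum distance to the seeds already chosen, so the seeds spread across the data range. Seeding of this kind is deterministic once the first point is fixed, and it is sensitive to outliers.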

Objective

Evaluation Criteria

Results and Discussion

CONCLUSION

Journal: International Journal of Information Retrieval Research
Publication Date: Oct 19, 2021
Citations: 5
License type: CC BY 3.0

Similar Papers

NDPD: an improved initial centroid method of partitional clustering for big data mining
Kamlesh Kumar Pandey ... Diwakar Shukla
Journal of Advances in Management Research | VOL. 20
23 Aug 2022

PENINGKATAN KINERJA ALGORITMA K MEANS DENGAN MENGGUNAKAN PARTICLE SWARM OPTIMIZATION DALAM PENGELOMPOKAN DATA PENYEDIAAN AKSES
Ari Yunus Hendrawan
Electro Luceat | VOL. 6
03 Nov 2020

New Approach for K-mean and K-medoids Algorithm
Abhishek Patel ... Purnima Singh
International Journal of Computer Applications Technology and Research | VOL. 2
10 Jan 2012

Antlion Optimizer Algorithm Modification for Initial Centroid Determination in K-means Algorithm
Nanang Lestio Wibowo ... Moch Arief Soeleman
Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) | VOL. 7
12 Aug 2023