Clustering of Massive Datasets Using an Adaptive and Efficient K-Means Approach

V. Tharakeswari, Shaik Mohammed Imran, Muthukumaran M

doi:10.58599/ijsmem.2023.1204

Abstract

In today’s technology-driven, Internet-centered society, sifting through huge amounts of information to find relevant knowledge for various educational contexts can be challenging. Simple, fast, and adaptable machine learning algorithms make such tasks easier. K-means is among the most widely used unsupervised learning techniques for grouping data into meaningful clusters. It partitions the data into k clusters according to shared characteristics, where k is chosen in advance. Unfortunately, standard k-means is computationally expensive, since every iteration recomputes the distance from every data point to every centroid. Researchers have proposed a number of strategies to speed up k-means clustering. To reduce the computational load of k-means on very large datasets, this work proposes computing the initial centroids and using a distance-based criterion to separate data points that are unlikely to change cluster in subsequent iterations from those that are very likely to do so. The method identifies the data points whose cluster assignment is statistically likely to change in the next few iterations. The approach is evaluated on several datasets and compared with other k-means methods.
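The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not state how the initial centroids are computed or which distance criterion is used, so the sketch makes its own assumptions. It assumes k-means++-style seeding and a triangle-inequality margin test, in which a point whose gap between its second-nearest and nearest centroid exceeds twice the largest centroid shift provably keeps its cluster and is skipped. The names `init_centroids`, `adaptive_kmeans`, and `margin` are hypothetical.

```python
import numpy as np

def init_centroids(X, k, rng):
    """k-means++-style seeding (an assumed choice, not taken from the paper)."""
    centroids = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        # Squared distance from each point to its nearest chosen centroid.
        d2 = ((X[:, None, :] - np.array(centroids)[None]) ** 2).sum(-1).min(axis=1)
        centroids.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centroids, dtype=float)

def adaptive_kmeans(X, k, max_iter=100, seed=0):
    """k-means (k >= 2) that only re-examines points that might switch cluster."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    C = init_centroids(X, k, rng)
    labels = np.zeros(len(X), dtype=int)
    # margin[i] is a lower bound on (second-nearest - nearest) centroid distance.
    # Starting at 0 forces every point to be examined in the first pass.
    margin = np.zeros(len(X))
    for _ in range(max_iter):
        active = np.flatnonzero(margin <= 0)  # points that may change cluster
        if active.size:
            d = np.linalg.norm(X[active, None, :] - C[None, :, :], axis=2)
            labels[active] = d.argmin(axis=1)
            two_smallest = np.sort(d, axis=1)[:, :2]
            margin[active] = two_smallest[:, 1] - two_smallest[:, 0]
        # Standard centroid update; an empty cluster keeps its old centroid.
        new_C = C.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):
                new_C[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_C - C, axis=1).max()
        C = new_C
        if shift == 0:  # no centroid moved, so assignments are final
            break
        # Each centroid moved by at most `shift`, so every gap shrinks by
        # at most 2 * shift; points whose bound stays positive are skipped.
        margin -= 2 * shift
    return C, labels
```

With this particular criterion the skipped points provably keep their labels, so the result matches plain Lloyd iterations from the same starting centroids; a statistical criterion such as the one the abstract mentions would instead trade some exactness for speed. A call such as `adaptive_kmeans(X, 3)` returns the final centroids and a label for each row of `X`.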

Full Text