Abstract

Purpose: Data mining relies heavily on clustering algorithms. Distributed data mining systems, and decentralized clustering in particular, have become widely used over the past 10 years because they handle large, heterogeneous datasets that cannot be gathered at a central site. Objectives/Methodology: For spatial data mining datasets, numerous clustering algorithms operate at both the local and the hierarchical level. In this paper, we propose a novel method for clustering heterogeneous distributed datasets based on the K-Means algorithm (HCA-K-Means). When tested against the BIRCH and DBSCAN algorithms, the proposed algorithm performed better and took less time to run. Results/Findings: Both the partitioning and the hierarchical classes of clustering algorithms have weaknesses. In the partitioning class, the k-means algorithm requires the number of clusters, K, to be fixed in advance, but in most cases K is not known. Hierarchical clustering algorithms overcome this limitation, but they must still define stopping conditions for the clustering decomposition, which are not straightforward. Moreover, current methods for pruning irrelevant clusters rely on bounding hyperspheres or bounding rectangles, whose lack of tightness compromises their efficiency in exact nearest-neighbor search. Type of Paper: Research Paper

