Abstract

Data clustering is frequently used in the early stages of big data analysis. It allows massive datasets containing diverse types of data to be examined for undiscovered correlations, hidden patterns, and other useful information. Assessing algorithms designed for large-scale data is a significant research challenge across many fields: evaluating different algorithms on massive data can yield divergent or even contradictory results, a phenomenon that remains insufficiently explored. This paper addresses the issue by proposing a framework for evaluating clustering algorithms that reconciles divergent or conflicting evaluation outcomes. A multicriteria decision-making (MCDM) method is used to assess the clustering algorithms. Using the EDAS (Evaluation based on Distance from Average Solution) rating system, the paper evaluates six clustering algorithms, namely the K-means (KM) algorithm, the expectation maximization (EM) algorithm, filtered clustering (FC), the farthest-first (FF) algorithm, make density-based clustering (MD), and hierarchical clustering (HC), against six external clustering measures. Hierarchical clustering (HC) has the highest ASi value, 0.929506, and is ranked 1st. The farthest-first (FF) algorithm has an ASi value of 0.753745 and is ranked 2nd. The K-means (KM) algorithm has an ASi value of 0.055376 and is ranked 3rd. Filtered clustering (FC) has an ASi value of 0.055173 and is ranked 4th. The expectation maximization (EM) algorithm has an ASi value of 0.048021 and is ranked 5th. Make density-based clustering (MD) has an ASi value of 0.011219 and is ranked 6th. The ASi values assess each algorithm's overall performance, and the rankings compare the algorithms with one another.
Based on these results, hierarchical clustering achieves the highest ASi value and is ranked first, indicating that it performs best among the six algorithms evaluated.
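
As an illustration of how an EDAS appraisal score (ASi) can be computed, the following is a minimal sketch of the standard EDAS steps: average solution, positive and negative distances from the average, weighted sums, normalization, and the final score. It is not the paper's implementation. The function name `edas_scores`, the equal weights, and the small decision matrix are hypothetical and are not taken from the paper's data.

```python
def edas_scores(matrix, weights, beneficial):
    """Return one EDAS appraisal score per alternative (row of matrix).

    matrix     -- rows are alternatives, columns are criteria (positive values)
    weights    -- one weight per criterion, assumed to sum to 1
    beneficial -- True where larger values are better, False for cost criteria
    """
    n, m = len(matrix), len(weights)
    # Step 1: average solution for each criterion.
    av = [sum(row[j] for row in matrix) / n for j in range(m)]

    sp, sn = [], []
    for row in matrix:
        pos = neg = 0.0
        for j in range(m):
            # Relative distance from the average; sign flipped for cost criteria.
            diff = (row[j] - av[j]) / av[j]
            if not beneficial[j]:
                diff = -diff
            pos += weights[j] * max(0.0, diff)   # weighted positive distance (PDA)
            neg += weights[j] * max(0.0, -diff)  # weighted negative distance (NDA)
        sp.append(pos)
        sn.append(neg)

    # Step 2: normalize; fall back to 1.0 to avoid dividing by zero
    # when every alternative equals the average.
    max_sp = max(sp) or 1.0
    max_sn = max(sn) or 1.0
    nsp = [s / max_sp for s in sp]
    nsn = [1.0 - s / max_sn for s in sn]

    # Step 3: appraisal score is the mean of the two normalized sums.
    return [(p + q) / 2 for p, q in zip(nsp, nsn)]


# Hypothetical data: three alternatives scored on two beneficial criteria.
example_matrix = [[0.9, 0.8], [0.5, 0.6], [0.4, 0.4]]
example_scores = edas_scores(example_matrix, [0.5, 0.5], [True, True])
# Indices of the alternatives ordered from best to worst.
example_ranking = sorted(range(len(example_scores)), key=lambda i: -example_scores[i])
```

A higher score places an alternative further above the average solution across the weighted criteria, which is why the ranking follows the scores in descending order.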
