Indonesia's frequent earthquakes, caused by its position at the convergence of multiple tectonic plates, necessitate precise seismic zone identification to improve disaster preparedness. This research evaluates five clustering algorithms for analyzing earthquake data from January 2017 to January 2023: K-Medoids, K-Means, DBSCAN, Fuzzy C-Means, and K-Affinity Propagation (K-AP). Using a BMKG dataset of 13,860 seismic events, each algorithm was assessed on two metrics, Silhouette Score and Cluster Purity. K-Means provided the best balance between the two, forming six clusters with a Silhouette Score of 0.3245 and a Cluster Purity of 0.7366, which makes it the most suitable for seismic zone analysis. K-Medoids followed closely with a Silhouette Score of 0.3158 and a Cluster Purity of 0.7190. Although DBSCAN handled noise effectively, its negative Silhouette values indicated poor clustering quality. Fuzzy C-Means and K-AP underperformed, with K-AP generating an impractically high number of clusters (196) and a Silhouette Score of only 0.2550. This study offers a novel, comprehensive comparison of clustering algorithms for Indonesian earthquake data, emphasizing a dual-metric evaluation approach. By identifying K-Means as the most effective algorithm, it provides valuable insights for disaster mitigation and seismic risk analysis.
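The dual-metric evaluation described above can be illustrated with a minimal sketch. It is not the authors' code: the function names `cluster_purity` and `mean_silhouette`, the toy coordinates, and the "shallow"/"deep" reference labels are all hypothetical, and the abstract does not state which reference labels were used for purity. Both metrics are written in plain Python so the definitions are visible; a real analysis would more likely rely on a library such as scikit-learn, which provides `sklearn.metrics.silhouette_score`.

```python
from collections import Counter
from math import dist


def cluster_purity(cluster_labels, reference_labels):
    """Share of points that carry the majority reference label of their cluster."""
    by_cluster = {}
    for c, r in zip(cluster_labels, reference_labels):
        by_cluster.setdefault(c, Counter())[r] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(cluster_labels)


def mean_silhouette(points, cluster_labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for p, c in zip(points, cluster_labels):
        clusters.setdefault(c, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, c in zip(points, cluster_labels):
        own = clusters[c]
        if len(own) == 1:
            continue  # a singleton cluster contributes a silhouette of 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for label, other in clusters.items()
            if label != c
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
cluster_labels = [0, 0, 0, 1, 1, 1]
# Hypothetical reference classes, assumed only for this illustration.
reference_labels = ["shallow", "shallow", "deep", "deep", "deep", "deep"]

purity = cluster_purity(cluster_labels, reference_labels)
score = mean_silhouette(points, cluster_labels)
print(f"purity={purity:.4f} silhouette={score:.4f}")
```

A Silhouette Score near 1 indicates compact, well-separated clusters, while values near or below 0 indicate overlapping or misassigned points. Purity rewards clusters dominated by a single reference class, but it tends to rise as the number of clusters grows, which is one reason to read the two metrics together.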