Determining the number of clusters, before finding clusters, from the susceptibility of the similarity matrix

E Lippiello, S Baccari, P Bountzis

doi:10.1016/j.physa.2023.128592

Abstract

Clustering is a fundamental procedure for extracting meaningful insights from a data set. The quality of the resulting clusters depends largely on a correct estimate of their number, K∗, which many clustering algorithms require as an input parameter. Only a few techniques detect K∗ automatically, and these are usually based on cluster validity indexes, which are computationally expensive. Here, we present a new algorithm that yields an accurate estimate of K∗ without partitioning the data into clusters. This makes the algorithm particularly efficient, in both time and space complexity, when handling large-scale data sets. The algorithm highlights the block structure that is implicitly present in the similarity matrix and identifies K∗ with the number of blocks in the matrix. We test the algorithm on synthetic data sets with and without a hierarchical organization of elements. We explore a wide range of K∗ and show that the proposed algorithm identifies K∗ effectively, and even more accurately than existing methods based on standard internal validity indexes, with a large advantage in computation time and memory storage. We also discuss the application of the new algorithm to the declustering of instrumental earthquake catalogs, a procedure aimed at identifying the level of background seismic activity, which is useful for seismic hazard assessment.
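The link between blocks in the similarity matrix and the number of clusters can be illustrated with a toy example. The sketch below is not the susceptibility-based estimator of the paper, whose details the abstract does not give. It swaps in a plain union-find count of connected blocks in a thresholded similarity relation. The helper names (`make_blobs`, `count_blocks`), the distance threshold, and the synthetic data are all assumptions made for illustration, and the approach only works when clusters are well separated.

```python
import math
import random


def make_blobs(k, n_per_cluster=20, spacing=10.0, noise=0.3, seed=0):
    """Hypothetical test data: k well-separated 2D Gaussian blobs on a line."""
    rng = random.Random(seed)
    points = [
        (spacing * c + rng.gauss(0.0, noise), rng.gauss(0.0, noise))
        for c in range(k)
        for _ in range(n_per_cluster)
    ]
    rng.shuffle(points)  # hide the block ordering, as in real data
    return points


def count_blocks(points, threshold):
    """Count blocks of the relation 'distance < threshold' with union-find.

    Two points are treated as similar when they are closer than `threshold`
    (an assumed, hand-picked parameter). The full similarity matrix is never
    stored: each pair is visited once and only a running count is kept.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    blocks = len(points)  # start with every point in its own block
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) < threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_i] = root_j  # merge two blocks
                    blocks -= 1
    return blocks


estimated_k = count_blocks(make_blobs(k=4), threshold=3.0)
print(estimated_k)
```

This toy version still visits every pair of points, so it runs in quadratic time, and unlike the approach described in the abstract it relies on a manually chosen threshold. It is meant only to make the "number of blocks equals K∗" idea concrete.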


Journal: Physica A: Statistical Mechanics and its Applications
Publication Date: Mar 2, 2023
Citations: 1

Similar Papers

A Novel Cluster Validity Index Based on Local Cores.
Dongdong Cheng ... Qingsheng Zhu
IEEE Transactions on Neural Networks and Learning Systems | VOL. 30
02 Aug 2018

Efficient synthetical clustering validity indexes for hierarchical clustering
Qin Xu ... Bin Luo
Expert Systems with Applications | VOL. 151
13 Mar 2020

Object-based cluster validation with densities
Behnam Tavakkol ... Susan L Albin
Pattern Recognition | VOL. 121
04 Aug 2021

A COMPARATIVE STUDY OF CLUSTER VALIDITY INDICES
N E Kondruk
Radio Electronics, Computer Science, Control | VOL. 0
25 Nov 2019
