Disentangling clusters from non-Euclidean data via graph frequency reorganization

Yangli-Ao Geng, Chong-Yung Chi, Wenju Sun, Jing Zhang, Qingyong Li

doi:10.1016/j.ins.2024.120288

Abstract

In light of the growing need for non-Euclidean data analysis, graphs have been recognized as an effective tool for characterizing the distribution and correlation of such data, inspiring many graph-based developments for applications such as clustering of non-Euclidean data. However, in unsupervised scenarios, graphs constructed from unlabeled data often contain numerous noisy links, which leads to serious performance degradation in the applications concerned. To resolve this issue, we propose a novel method, referred to as Graph Frequency Reorganization (GFR), to enhance the discriminability of potential clusters and the associated graph quality. GFR is capable of compensating for the suboptimality of unsupervised graph construction. Furthermore, a fast version of GFR is proposed to reduce its computational overhead on large-scale datasets. Consequently, unsupervised clustering results can be significantly improved by using the GFR data (i.e., the data after GFR processing). To evaluate the effectiveness of GFR, experimental results on ten real-world datasets are provided, demonstrating that the overall clustering performance of a simple k-means applied to the GFR data is superior to that of several state-of-the-art graph-based clustering methods.

Full Text