Multi-source data has received increasing attention due to its strong performance in clustering tasks. However, existing multi-source data clustering methods rely on shallow graph learning to model the similarity graphs of multi-source data, and they ignore the importance of weak connections between samples within the same cluster. As a result, these weak connections go unexplored in the global similarity graph of the multi-source data. In this paper, we propose a unified representation aggregated from graph and global features (UGGF) method for multi-source data clustering. Specifically, UGGF uses a cross-source graph diffusion process to preserve the invariant connections in the multi-source similarity graphs and to retain weak connections between samples through their connection relationships across different sources. Furthermore, inspired by self-attention, a transformer encoder is used to learn a multi-source global feature representation. The global features and similarity graphs are then integrated to explore the multi-source data comprehensively and to obtain a unified representation for the final clustering task. The proposed UGGF method is validated on four benchmark datasets. Extensive experimental results demonstrate the superiority of the proposed method over other state-of-the-art multi-source data clustering methods.
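To make the idea of cross-source graph diffusion concrete, the following is a minimal sketch under stated assumptions. It is not the UGGF formulation, which the abstract does not specify. It swaps in a generic cross-diffusion update in the style of similarity network fusion, where each source's graph is propagated through its own local neighbourhood structure using the average graph of the other sources. The function names, the neighbourhood size `k`, and the toy matrices are all hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def row_normalize(m: Matrix) -> Matrix:
    """Scale each row to sum to 1 (all-zero rows stay zero)."""
    out = []
    for row in m:
        s = sum(row)
        out.append([x / s if s > 0 else 0.0 for x in row])
    return out


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def knn_sparsify(m: Matrix, k: int) -> Matrix:
    """Keep the k largest entries of each row and zero the rest."""
    out = []
    for row in m:
        order = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        keep = set(order[:k])
        out.append([x if j in keep else 0.0 for j, x in enumerate(row)])
    return out


def cross_source_diffusion(graphs: List[Matrix], k: int = 2,
                           steps: int = 3) -> Matrix:
    """Assumed generic update: state_v <- local_v @ mean(state_u, u != v) @ local_v^T."""
    local = [row_normalize(knn_sparsify(g, k)) for g in graphs]
    state = [row_normalize(g) for g in graphs]
    n = len(graphs[0])
    for _ in range(steps):
        new_state = []
        for v in range(len(graphs)):
            others = [state[u] for u in range(len(graphs)) if u != v]
            avg = [[sum(o[i][j] for o in others) / len(others)
                    for j in range(n)] for i in range(n)]
            diffused = matmul(matmul(local[v], avg), transpose(local[v]))
            new_state.append(row_normalize(diffused))
        state = new_state
    # Average over sources, then symmetrize to obtain one fused graph.
    fused = [[sum(s[i][j] for s in state) / len(state)
              for j in range(n)] for i in range(n)]
    return [[(fused[i][j] + fused[j][i]) / 2 for j in range(n)]
            for i in range(n)]


# Toy data (hypothetical): samples 0 and 2 are not connected in either source,
# but both are connected to sample 1, each in a different source.
source_a = [[1.0, 0.9, 0.0], [0.9, 1.0, 0.0], [0.0, 0.0, 1.0]]
source_b = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.9], [0.0, 0.9, 1.0]]
fused_graph = cross_source_diffusion([source_a, source_b])
print(fused_graph[0][2] > 0)  # the 0-2 edge is now non-zero
```

In this toy case the edge between samples 0 and 2 is absent from both input graphs, yet it becomes non-zero after diffusion because the two samples share a neighbour across sources. This is one way a weak within-cluster connection could surface in a fused graph. The resulting matrix could then be handed to any graph-based clustering step, such as spectral clustering on a precomputed affinity.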