Data Representation by Joint Hypergraph Embedding and Sparse Coding

Guo Zhong, Chi-Man Pun

doi:10.1109/tkde.2020.3009488

Abstract

Matrix factorization (MF), a popular unsupervised learning technique for data representation, has been widely applied in data mining and machine learning. Depending on the application scenario, one can impose different constraints on the factorization to find a desired basis, which captures high-level semantics of the given data, and to learn the compact representation corresponding to that basis. We note that almost all previous work on MF in data mining has neglected to find a basis that carries high-order semantics in the data. In this article, we propose a novel MF framework called Joint Hypergraph Embedding and Sparse Coding (JHESC), in which the obtained basis captures high-order semantic information in the data. Specifically, we first propose a new hypergraph learning model that obtains a more discriminative basis via hypergraph-based Laplacian Eigenmaps; sparse coding is then conducted on the learned basis so that the new representation has stronger discriminative power. In addition, we extend the proposed method to the reproducing kernel Hilbert space to deal with nonlinear data more effectively. Extensive experimental results on data clustering demonstrate that the proposed method consistently outperforms other state-of-the-art matrix factorization methods.
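To make the two-stage idea in the abstract concrete, the following is a minimal generic sketch, not the JHESC algorithm itself. It assumes a k-nearest-neighbour hypergraph with unit hyperedge weights, the standard normalized hypergraph Laplacian, a ridge-regression step to turn the spectral embedding into a basis, and ISTA for the sparse coding step. None of these choices is taken from the abstract; all function names and parameters (`k`, `alpha`, `beta`, `n_iter`) are hypothetical.

```python
import numpy as np

def knn_hypergraph_incidence(X, k=3):
    """Incidence matrix H (vertices x hyperedges). Assumption: one hyperedge
    per sample, containing the sample and its k nearest neighbours."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared distances
    np.fill_diagonal(d2, -np.inf)  # guarantees each sample sorts first in its own row
    H = np.zeros((n, n))
    for j in range(n):
        members = np.argsort(d2[j])[: k + 1]
        H[members, j] = 1.0
    return H

def hypergraph_laplacian(H, w=None):
    """Normalized hypergraph Laplacian I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}."""
    n, m = H.shape
    w = np.ones(m) if w is None else np.asarray(w, dtype=float)
    dv = H @ w            # vertex degrees
    de = H.sum(axis=0)    # hyperedge degrees
    S = H / np.sqrt(dv)[:, None]
    theta = (S * (w / de)[None, :]) @ S.T
    return np.eye(n) - theta

def spectral_embedding(L, d):
    """Laplacian-Eigenmap style embedding: eigenvectors of the d smallest
    eigenvalues after the first (trivial) one."""
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return vecs[:, 1 : d + 1]

def ridge_basis(X, Y, alpha=1e-2):
    """Assumed basis-learning step: U = argmin ||Y - X U||^2 + alpha ||U||^2."""
    m = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(m), X.T @ Y)

def lasso_objective(X, U, A, beta):
    return 0.5 * np.sum((X - A @ U.T) ** 2) + beta * np.sum(np.abs(A))

def sparse_codes_ista(X, U, beta=0.1, n_iter=200):
    """Sparse codes A minimizing 0.5 ||X - A U^T||_F^2 + beta ||A||_1 via ISTA."""
    step = 1.0 / (np.linalg.norm(U, 2) ** 2 + 1e-12)  # 1 / Lipschitz constant
    A = np.zeros((X.shape[0], U.shape[1]))
    for _ in range(n_iter):
        Z = A - step * ((A @ U.T - X) @ U)                            # gradient step
        A = np.sign(Z) * np.maximum(np.abs(Z) - step * beta, 0.0)     # soft threshold
    return A
```

With samples stored as rows of `X`, the pieces chain as `H = knn_hypergraph_incidence(X)`, `L = hypergraph_laplacian(H)`, `Y = spectral_embedding(L, d)`, `U = ridge_basis(X, Y)`, `A = sparse_codes_ista(X, U)`; the rows of `A` would then serve as the new representation passed to a clustering step.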

Full Text