Heterogeneous network embedding aims to learn a mapping from network data in the original topological space to vector representations in a low-dimensional latent space, while preserving valuable information such as structural and semantic information. The resulting vector representations have shown promising performance in a wide range of real-world applications, such as node classification and node clustering. However, most existing methods focus solely on modeling network structural information and ignore the rich multi-source information associated with different types of nodes. In this paper, we propose a novel Multi-source Information Fusion based Heterogeneous Network Embedding (MIFHNE) approach. We first capture semantic information using a meta-graph-based random walk strategy. Subsequently, we jointly model structural proximity, attribute information, and label information within the framework of Nonnegative Matrix Factorization (NMF). Theoretical proofs and comprehensive experiments on two real-world heterogeneous network datasets demonstrate the feasibility and effectiveness of our approach.
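To make the random-walk step more concrete, the following is a minimal sketch of a type-guided walk on a toy heterogeneous network. It is not the MIFHNE procedure: it uses a simple meta-path (a linear sequence of node types), which is a special case of a meta-graph, and the function name `meta_path_walk`, the toy author/paper/venue graph, and all parameters are assumptions introduced purely for illustration.

```python
import random

# Toy heterogeneous network (hypothetical): authors (A), papers (P), a venue (V).
node_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "v1": "V"}
edges = [("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("p1", "v1"), ("p2", "v1")]

# Build an undirected adjacency list.
adj = {n: [] for n in node_type}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)


def meta_path_walk(adj, node_type, start, meta_path, walk_length, rng):
    """Random walk whose node types follow `meta_path` cyclically.

    `meta_path` is a list of type labels whose first and last entries match,
    e.g. ["A", "P", "A"], so the pattern can repeat. The walk stops early if
    the current node has no neighbor of the required next type.
    """
    if meta_path[0] != meta_path[-1]:
        raise ValueError("meta-path must start and end with the same type")
    if node_type[start] != meta_path[0]:
        raise ValueError("start node type must match the first meta-path type")

    period = len(meta_path) - 1
    walk = [start]
    step = 0
    while len(walk) < walk_length:
        next_type = meta_path[(step % period) + 1]
        candidates = [v for v in adj[walk[-1]] if node_type[v] == next_type]
        if not candidates:
            break  # dead end for this type pattern
        walk.append(rng.choice(candidates))
        step += 1
    return walk


rng = random.Random(0)  # fixed seed for repeatable walks
walk = meta_path_walk(adj, node_type, "a1", ["A", "P", "A"], 7, rng)
print(walk)
```

Walks of this kind are typically fed to a co-occurrence or skip-gram style model so that nodes appearing in similar typed contexts receive similar embeddings; how MIFHNE combines its walks with the NMF objective is specified in the full paper, not in this sketch.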