MBRep: Motif-based representation learning in heterogeneous networks

Qian Hu, Fan Lin, Beizhan Wang, Chunyan Li

doi:10.1016/j.eswa.2021.116031

Abstract

In recent years, there has been a surge of interest in applying machine learning to the graphs and networks that already exist in the world around us. The approach has been used successfully in domains as diverse as traffic management, e-commerce recommendation and public opinion monitoring. A critical aspect of representation learning for applied machine learning is feature engineering. Deep learning-based research in representation learning has developed methods for automatically learning a large number of potentially correlated features from the original networks. However, most of these methods cannot be applied to heterogeneous networks, which are more faithful representations of the real world. This is because they do not adequately capture the structure and semantics of the different types of nodes in heterogeneous networks and of the links between them. They also struggle to represent higher-order heterogeneous patterns of connection. This paper proposes a generalized motif-based higher-order representation learning method, MBRep, that learns triangle motif embeddings in a network and, on that basis, obtains embeddings for the nodes of a heterogeneous network. Statistically significant motif structures are extracted from the original heterogeneous network and combined with the heterogeneity of the nodes. A weight-biased random walk is then applied to the motif-level higher-order network, and a SkipGram model is used to embed the motifs. After this, the embeddings of the original network nodes are calculated using weighted averages and feature alignment. These node embeddings can then be used for link prediction. We confirmed the effectiveness of MBRep by comparing its AUC and MRR performance with that of other state-of-the-art methods on three real-world datasets. Its adaptability was also validated by a cold-start test.
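The first two stages of the pipeline described above (triangle motif extraction, then a weight-biased random walk over the motif-level network) can be illustrated with a minimal sketch. It is not the authors' implementation. The toy graph, the function names `find_triangles`, `build_motif_graph` and `biased_walk`, and the choice of the number of shared nodes as the edge weight between two motifs are all assumptions made for illustration; the statistical significance test, the node-type handling and the SkipGram step are omitted.

```python
import random
from itertools import combinations


def find_triangles(adj):
    """Enumerate each triangle once, as a sorted 3-tuple of node ids."""
    triangles = set()
    for u in adj:
        for v, w in combinations(sorted(adj[u]), 2):
            if w in adj[v]:
                triangles.add(tuple(sorted((u, v, w))))
    return sorted(triangles)


def build_motif_graph(triangles):
    """Connect triangles that share nodes.

    Assumption: the edge weight is the number of shared nodes.
    """
    graph = {t: {} for t in triangles}
    for a, b in combinations(triangles, 2):
        shared = len(set(a) & set(b))
        if shared:
            graph[a][b] = shared
            graph[b][a] = shared
    return graph


def biased_walk(graph, start, length, rng):
    """Walk over motifs, choosing the next motif in proportion to edge weight."""
    walk = [start]
    while len(walk) < length:
        neighbours = graph[walk[-1]]
        if not neighbours:
            break
        nodes = list(neighbours)
        weights = [neighbours[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


# Hypothetical toy network: authors (a*), papers (p*) and a venue (v1).
adj = {
    "a1": {"a2", "p1", "p2"},
    "a2": {"a1", "p1", "p2"},
    "p1": {"a1", "a2"},
    "p2": {"a1", "a2", "v1"},
    "v1": {"p2"},
}

triangles = find_triangles(adj)
motif_graph = build_motif_graph(triangles)
walk = biased_walk(motif_graph, triangles[0], length=4, rng=random.Random(0))
```

Walks generated this way play the role of sentences for a SkipGram model, with each motif treated as a token. A node embedding could then be formed as a weighted average of the embeddings of the motifs that contain the node.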

Full Text