Abstract

Low-dimensional node embeddings are convenient for clustering, one of the most fundamental tasks on heterogeneous information networks (HINs). However, random-walk-based network embedding has been proved equivalent to matrix factorization, which is computationally expensive. Moreover, mapping different types of nodes into a single metric space may result in incompatibility. To address these two challenges, this paper proposes a meta-path-embedding-based clustering method called MPEClus. First, the original network is transformed into several subnetworks, each with independent semantics specified by a meta-path, to solve the incompatibility problem. Second, an approximate commute embedding method, which bypasses eigen-decomposition to reduce computational cost, is used to learn representations of the nodes in each subnetwork. Finally, a unified probabilistic generative model aggregates the vector representations learned in the different metric spaces for clustering. Experimental results show that MPEClus is effective for HIN clustering and outperforms state-of-the-art baselines on two real-world datasets.
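
To illustrate the first step, the following is a minimal sketch of how a meta-path can induce a homogeneous subnetwork from an HIN. It assumes the common construction of multiplying relation matrices along the meta-path; the abstract does not specify how MPEClus builds its subnetworks, so this is not necessarily the authors' procedure. The function name `metapath_adjacency` and the toy author-paper data are hypothetical.

```python
import numpy as np

def metapath_adjacency(relations):
    """Compose relation matrices along a meta-path into one weighted adjacency matrix."""
    result = relations[0]
    for rel in relations[1:]:
        result = result @ rel
    return result

# Hypothetical toy data: 3 authors x 2 papers, 1 means "author wrote paper".
author_paper = np.array([[1, 0],
                         [1, 1],
                         [0, 1]])

# Meta-path Author-Paper-Author: entry (i, j) counts the papers shared by authors i and j.
apa = metapath_adjacency([author_paper, author_paper.T])
np.fill_diagonal(apa, 0)  # drop paths that return to the same author
print(apa)
```

In this toy example, authors 0 and 2 each share one paper with author 1 but none with each other, so the resulting subnetwork is a path with author 1 in the middle. Each meta-path yields its own subnetwork of this kind, which can then be embedded separately.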

