Abstract

Heterogeneous information network embedding has been studied intensively in recent years. However, existing methods require users to manually specify meta-paths or meta-graphs in advance. Moreover, most previous approaches consider only a single type of meta-path or meta-graph, which is usually sparse and biased, so the learned node representations may be incomplete and inaccurate. To address these limitations, we propose an extensible semantic description structure called the Composite Meta-Graph (CMG). With this structure, users do not need to select an appropriate meta-path or meta-graph. CMG accurately captures rich semantic relations and structural contexts between nodes of different types and at different distances. We also propose a CMG-based heterogeneous information network embedding framework, CMG2Vec. By extending the auto-encoder to the heterogeneous network setting, CMG2Vec embeds the multi-order proximities between nodes learned from CMG into latent representations through a series of non-linear encoding-decoding mappings. During fusion, an attention mechanism automatically learns the weights of these latent vectors, which enables each final node representation to focus on the proximity order that is most informative. Experimental results on three large-scale datasets demonstrate that our method outperforms existing state-of-the-art homogeneous and heterogeneous network embedding approaches on three network mining tasks: node classification, node clustering, and node similarity search.
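
The attention-based fusion step can be illustrated with a minimal sketch. The abstract states only that an attention mechanism learns weights over the per-order latent vectors; the scoring function (a dot product between a query vector and the tanh of each latent vector), the function name `attention_fuse`, and the `query` parameter below are assumptions made for illustration, not the formulation used in CMG2Vec.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(order_vectors, query):
    """Fuse the per-order latent vectors of one node into a single vector.

    order_vectors: list of K latent vectors (one per proximity order),
                   each a list of d floats.
    query:         hypothetical attention vector of length d; in a real
                   model it would be learned, here it is just an input.
    Returns (fused_vector, attention_weights).
    """
    # Assumed scoring function: query . tanh(h_k) for each order k.
    scores = [
        sum(q * math.tanh(h) for q, h in zip(query, vec))
        for vec in order_vectors
    ]
    weights = softmax(scores)

    # Weighted sum of the latent vectors, one coordinate at a time.
    dim = len(order_vectors[0])
    fused = [
        sum(w * vec[i] for w, vec in zip(weights, order_vectors))
        for i in range(dim)
    ]
    return fused, weights


# Three made-up latent vectors for one node (orders 1, 2, 3).
latent = [[0.2, -0.1, 0.4], [0.9, 0.3, -0.2], [0.0, 0.5, 0.1]]
fused, weights = attention_fuse(latent, query=[1.0, 0.5, -0.5])
```

In this sketch the weights sum to one, so the fused representation is a convex combination of the per-order vectors; the order with the highest score contributes most to it.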
