Abstract

Network embedding maps the nodes of a network to a continuous vector space, which can then be used as the input to downstream tasks such as node classification, node clustering, link prediction, and similarity search. To learn network embeddings more effectively, many techniques use random walks to capture the network structure. With the emergence of meta-paths for heterogeneous networks, network embeddings can carry more semantic interpretation. Consequently, various random walks based on meta-path strategies have been proposed for network embedding. However, combining semantics and structure in a heterogeneous network has not yet achieved ideal results. To overcome this challenge, we start from a task-guided problem: we use the timestamp information in the heterogeneous network and apply temporal segmentation to decompose the network into a continuous temporal sequence. The set of context-paths between nodes is then computed in a continuous vector space by a depth-first meta-path search algorithm. More precisely, we propose a Temporal Sliding Density Walk (TSDW) algorithm that combines network semantics and structure effectively. Empirical results on network data show that TSDW significantly outperforms state-of-the-art representation learning models, including DeepWalk, LINE, Node2vec, PTE, Metapath2vec, HIN2vec, HTNE, and CTDNE, by 3.02% to 44.9% in Macro-F1 and 0.9% to 18.92% in Micro-F1 on multi-class node classification, and by 21% to 47% in NMI on node clustering.

Highlights

  • An increasing number of networks, such as literature and film-rating networks, are being represented as heterogeneous as researchers gain a deeper understanding of real-world networks

  • Our strategy works in two steps: 1) efficient temporal segmentation and time window sliding, which assumes that the nodes and links collected within a certain period express the strongest semantic connection; 2) a density walk that combines the semantics with the structure information in each snapshot

  • Example 1: Fig. 1(a) shows a temporal heterogeneous network consisting of four different types of nodes (author (A), paper (P), venue (V), domain (D)) and four different types of edges (A↔P: an author writes a paper or a paper is written by an author; P↔D: a paper belongs to a domain or a domain contains a paper; P↔V: a paper is published in a venue or a venue publishes a paper; P↔P: a paper cites another paper or a paper is cited by another paper)
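
The schema in Example 1 can be written down as a small data structure. The following is a minimal sketch, not the authors' implementation: the names TemporalEdge and is_valid_edge, the node identifiers, and the use of an integer year as the timestamp are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Node types from Example 1: author (A), paper (P), venue (V), domain (D).
NODE_TYPES = {"A", "P", "V", "D"}

# Edge types from Example 1. Each relation is described in both directions,
# so it is stored as an unordered pair of node types. P-P collapses to {"P"}.
EDGE_TYPES = {
    frozenset(("A", "P")),  # author writes paper
    frozenset(("P", "D")),  # paper belongs to domain
    frozenset(("P", "V")),  # paper is published in venue
    frozenset(("P", "P")),  # paper cites paper
}


@dataclass(frozen=True)
class TemporalEdge:
    """A timestamped, typed edge (hypothetical container for illustration)."""
    src: str
    dst: str
    src_type: str
    dst_type: str
    timestamp: int  # assumption: an integer such as a publication year


def is_valid_edge(edge: TemporalEdge) -> bool:
    """Return True if both node types are known and the pair is in the schema."""
    if edge.src_type not in NODE_TYPES or edge.dst_type not in NODE_TYPES:
        return False
    return frozenset((edge.src_type, edge.dst_type)) in EDGE_TYPES
```

For instance, an author-paper edge passes the check, while an author-venue edge does not, because Example 1 only links authors to venues through papers.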


Summary

INTRODUCTION

An increasing number of networks, such as literature and film-rating networks, are being represented as heterogeneous as researchers gain a deeper understanding of real-world networks. The most essential goal in embedding a traditional network is to map the nodes (or links) of a heterogeneous network to a continuous low-dimensional vector space, and the random walk method is usually used for this [3]. This approach collects neighboring nodes randomly (or under restrictions) and builds a sufficiently long walk path (node sequence) to capture as much of the network topology as possible. We propose a heterogeneous network embedding strategy, the Temporal Sliding Density Walk (TSDW), as an alternative to embedding with random walks based on expert-directed meta-paths. Our strategy works in two steps: 1) efficient temporal segmentation and time window sliding, which assumes that the nodes and links collected within a certain period express the strongest semantic connection; 2) a density walk that combines the semantics with the structure information in each snapshot. The results show that our algorithm performs well on the specific task-guided problem.
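
The two ideas in this summary, cutting a timestamped network into sliding-window snapshots and walking over one snapshot to produce a node sequence, can be sketched as follows. This is an illustration under stated assumptions only: the function names, the (u, v, t) edge format, and the window and step parameters are hypothetical, and the walk shown is a plain uniform random walk, not the density walk of TSDW, which this summary does not define.

```python
import random
from collections import defaultdict


def sliding_snapshots(edges, window, step):
    """Split timestamped edges (u, v, t) into overlapping snapshots.

    Returns a list of (start, snapshot_edges), where a snapshot holds the
    edges with start <= t < start + window. Hypothetical helper.
    """
    edges = sorted(edges, key=lambda e: e[2])
    if not edges:
        return []
    t_min, t_max = edges[0][2], edges[-1][2]
    snapshots = []
    start = t_min
    while start <= t_max:
        inside = [e for e in edges if start <= e[2] < start + window]
        snapshots.append((start, inside))
        start += step
    return snapshots


def random_walk(snapshot_edges, start_node, length, rng):
    """Uniform random walk over one snapshot, treated as undirected.

    Stand-in for the walk step; it is NOT the density walk of the paper.
    """
    adj = defaultdict(list)
    for u, v, _ in snapshot_edges:
        adj[u].append(v)
        adj[v].append(u)
    walk = [start_node]
    while len(walk) < length:
        neighbors = adj.get(walk[-1], [])
        if not neighbors:
            break  # isolated node in this snapshot: stop early
        walk.append(rng.choice(neighbors))
    return walk


# Toy data: authors (a*), papers (p*), a venue (v1), timestamps as years.
edges = [("a1", "p1", 2010), ("p1", "v1", 2010),
         ("a2", "p2", 2012), ("p2", "p1", 2013)]
snaps = sliding_snapshots(edges, window=2, step=1)
walk = random_walk(snaps[0][1], "a1", 5, random.Random(0))
```

Node sequences produced this way are the kind of input that skip-gram style embedding models consume; how TSDW biases the walk by density is described in the full text, not here.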

RELATED WORKS
PRELIMINARIES
TEMPORAL HETEROGENEOUS NETWORK EMBEDDING WITH TSDW
EXPERIMENTS
EXPERIMENTAL SETUP
DISCUSSION
Findings
CONCLUSION