Neighbor2vec: An Efficient Method for Graph Embedding

Zhiming Lin

doi:10.1109/isctech58360.2022.00088

Abstract

Representing nodes in large-scale networks so that both node properties and network structure are well captured remains challenging. Many works use “walks” generated by walk-based sampling strategies to learn latent neighborhood representations for the nodes in a network, treating the walks as the equivalent of sentences. However, these walk-based sampling strategies do not represent the neighborhood relationship sufficiently, because only two neighbors of a central node are sampled in one walk, which tends to be insufficient for nodes that have a large number of neighbors. Moreover, existing works ignore the overall structure of the network. In this paper we propose neighbor2vec, an algorithm based on a neighbor-focused sampling strategy, which serves as a framework for collecting structural information through information propagation from a node to its neighbors. We argue that neighbor2vec is a simple and effective way to enhance the equality and scalability of graph embedding and to outperform existing state-of-the-art methods. We evaluate neighbor2vec's latent representations on three link prediction and three node classification tasks, using the graph datasets ogbl-ppa, ogbl-collab, ogbl-citation2, ogbn-arxiv, ogbn-products, and ogbn-proteins. Neighbor2vec's embeddings improve accuracy scores by up to 10.8 percent over current state-of-the-art competing methods, and they show advantages over almost all baseline methods as well as GNN frameworks in the experiments.
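The sampling argument above can be illustrated with a small sketch. It is not the paper's implementation: the function names (`random_walk`, `walk_pairs`, `neighbor_pairs`), the toy star graph, and the context window of one step are all assumptions made for illustration. It only shows that a single visit of a walk to a hub node yields at most two of the hub's neighbors as context, whereas sampling directly from the adjacency list covers all of them.

```python
import random


def random_walk(adj, start, length, rng):
    """Uniform random walk of at most `length` nodes (hypothetical helper)."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return walk


def walk_pairs(walk):
    """(center, context) pairs using only the previous and next node in the walk.

    Assumes a context window of one step, so each visit to a node
    contributes at most two of its neighbors.
    """
    pairs = []
    for i, center in enumerate(walk):
        if i > 0:
            pairs.append((center, walk[i - 1]))
        if i < len(walk) - 1:
            pairs.append((center, walk[i + 1]))
    return pairs


def neighbor_pairs(adj, center):
    """(center, neighbor) pairs taken directly from the adjacency list."""
    return [(center, neighbor) for neighbor in adj[center]]


# Toy star graph (assumed for illustration): hub 0 connected to leaves 1..5.
adj = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}

rng = random.Random(0)
walk = random_walk(adj, start=1, length=3, rng=rng)  # leaf -> hub -> leaf

# Neighbors of the hub seen as context in this single walk.
hub_context_from_walk = {ctx for center, ctx in walk_pairs(walk) if center == 0}

# Neighbors of the hub seen when sampling from its neighborhood directly.
hub_context_from_neighbors = {ctx for _, ctx in neighbor_pairs(adj, 0)}

print(len(hub_context_from_walk), "neighbor(s) seen via one walk")
print(len(hub_context_from_neighbors), "neighbor(s) seen via neighbor sampling")
```

In practice, walk-based methods compensate by drawing many walks per node, which is where the cost grows for high-degree nodes; the sketch only makes the per-walk limitation concrete.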

Full Text