Abstract

Network representation learning algorithms, which aim at automatically encoding graphs into low-dimensional vector representations with a variety of node similarity definitions, have a wide range of downstream applications. Most existing methods either have low accuracies in downstream tasks or a very limited application field, such as article classification in citation networks. In this paper, we propose a novel network representation method, named Link Prediction based Network Representation (LPNR), which generalizes the latest graph neural network and optimizes a carefully designed objective function that preserves linkage structures. LPNR can not only learn meaningful node representations that achieve competitive accuracy in node centrality measurement and community detection but also achieve high accuracy in the link prediction task. Experiments prove the effectiveness of LPNR on three real-world networks. With the mini-batch and fixed sampling strategy, LPNR can learn the embedding of large graphs in a few hours.

Highlights

  • Most real-world data naturally come in the form of pairwise relations, such as protein-protein interactions in human cells, citation relations in scientific research, and drug-target interactions in medicine discovery[1,2,3]

  • With the understanding that network representations are capable of revealing hidden network structure and metric space of the nodes[39], we use the well-trained node features extracted from Link Prediction based Network Representation (LPNR) to measure node centralities

  • We propose LPNR, which learns network representations based on a graph neural network and a fixed sampling strategy

Summary

Introduction

Most real-world data naturally come in the form of pairwise relations, such as protein-protein interactions in human cells, citation relations in scientific research, and drug-target interactions in medicine discovery[1,2,3]. Local approaches predict links according to the local similarity of a graph, based on the assumption that two nodes are more likely to be connected if they have many common neighbors[20,25]. These algorithms are fast and highly parallel because they consider only the local structure. Global similarity based methods use topological information about the whole network to calculate the similarities of the nodes[25,26,27]. Although they achieve high prediction accuracy, they usually suffer from high computational complexity, which prevents them from being applied to complex graphs with millions of nodes and billions of edges. With the mechanism of graph neural networks, LPNR can improve the accuracy of link prediction and learn meaningful node representations, from which reasonable node rankings and proper community labels can be obtained in an unsupervised way. We use a logistic regression model to estimate the probability that an edge exists between two nodes.
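As a rough illustration of the two ideas mentioned in this summary, the sketch below computes a common-neighbors score (a local similarity measure) and a logistic edge probability from two node embeddings. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy graph, the weights, and the choice of the element-wise product as the pair feature are all hypothetical, since the summary does not say how LPNR combines the two node representations.

```python
import math


def common_neighbors(adj, u, v):
    """Local similarity: the number of neighbors shared by nodes u and v."""
    return len(adj[u] & adj[v])


def edge_probability(z_u, z_v, weights, bias):
    """Logistic regression over a node-pair feature.

    Assumption: the pair feature is the element-wise product of the two
    embeddings; the source does not specify the feature used by LPNR.
    """
    features = [a * b for a, b in zip(z_u, z_v)]
    logit = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy undirected graph stored as adjacency sets (hypothetical data).
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}

# Nodes b and d are not linked but share the neighbors a and c.
cn_bd = common_neighbors(adj, "b", "d")

# Hypothetical 2-dimensional embeddings and hand-picked weights.
p = edge_probability([0.5, -1.0], [1.0, 0.5], weights=[1.0, 1.0], bias=0.0)
```

The common-neighbors score needs only the two adjacency sets, which is why local methods are cheap and easy to parallelize; the logistic decoder instead relies on learned embeddings, and in practice its weights would be fitted on observed and sampled node pairs rather than set by hand.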

Decoding node pair relations and the fixed sampling strategy
Training LPNR
Experimental setup
Link prediction
Centrality measurement
Node clustering
Parameter sensitivity and sampling strategy
Findings
Conclusion and Discussion