Abstract

Given a real-world graph, how can we measure relevance scores for ranking and link prediction? Random walk with restart (RWR) provides an excellent measure for this and has been applied to applications such as friend recommendation, community detection, and anomaly detection. However, RWR suffers from two problems: 1) using the same restart probability for all nodes limits the expressiveness of the random walk, and 2) the restart probability must be chosen manually for each application without theoretical justification. This paper makes two main contributions. First, we propose Random Walk with Extended Restart (RWER), a random walk based measure that improves the expressiveness of random walks by using a distinct restart probability for each node. The improved expressiveness leads to superior accuracy for ranking and link prediction. Second, we propose SuRe (Supervised Restart for RWER), an algorithm for learning the restart probabilities of RWER from a given graph. SuRe eliminates the need to select the restart parameter for RWER heuristically and manually. Extensive experiments show that our proposed method provides the best performance for ranking and link prediction tasks.

Highlights

  • How can we measure effective node-to-node proximities for graph mining applications such as ranking and link prediction? Measuring relevance scores between nodes is a fundamental tool for many graph mining applications [1, 2, 3, 4, 5]

  • We propose Random Walk with Extended Restart (RWER), a new random walk model that improves the expressiveness of Random Walk with Restart (RWR)

  • The main idea of RWER is to introduce a restart probability vector in which each entry is the restart probability at one node, so that the restart probabilities are related to the preferences for the nodes
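
The per-node restart idea in the last highlight can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `rwer_scores`, the dense NumPy adjacency matrix, and the update rule (a surfer at node u restarts to the query node with probability c[u] and otherwise moves to a uniformly chosen out-neighbor) are one plausible reading of the description above. The sketch also assumes every node has at least one out-edge and every restart probability is positive.

```python
import numpy as np

def rwer_scores(adj, query, restart_vec, tol=1e-10, max_iter=1000):
    """Power iteration for a random walk with a per-node restart probability.

    Hypothetical sketch: at node u the surfer jumps back to `query` with
    probability restart_vec[u], otherwise follows a random out-edge.
    """
    adj = np.asarray(adj, dtype=float)
    c = np.asarray(restart_vec, dtype=float)
    n = adj.shape[0]
    # Row-normalize the adjacency matrix (assumes no node has zero out-degree).
    trans = adj / adj.sum(axis=1, keepdims=True)
    q = np.zeros(n)
    q[query] = 1.0
    r = q.copy()
    for _ in range(max_iter):
        # Probability mass that keeps walking from each node u: (1 - c[u]) * r[u].
        walk = trans.T @ ((1.0 - c) * r)
        # Probability mass that restarts, sum_u c[u] * r[u], all goes to the query node.
        r_next = walk + (c @ r) * q
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r
```

Passing a vector whose entries are all equal reduces this sketch to ordinary RWR with a single restart probability, which is the sense in which a restart vector is the more expressive model.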


Summary

Introduction

How can we measure effective node-to-node proximities for graph mining applications such as ranking and link prediction? Measuring relevance (i.e., proximity or similarity) scores between nodes is a fundamental tool for many graph mining applications [1, 2, 3, 4, 5]. Random Walk with Restart (RWR) assumes a fixed restart probability on all nodes; that is, a random surfer jumps back to the query node with the same probability regardless of where the surfer is currently located.
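
For concreteness, the standard RWR iteration with a single restart probability can be sketched as follows. The function name `rwr_scores`, the default value 0.15, and the dense-matrix representation are illustrative assumptions, not details taken from the paper; the sketch also assumes every node has at least one out-edge.

```python
import numpy as np

def rwr_scores(adj, query, restart=0.15, tol=1e-10, max_iter=1000):
    """Power iteration for r = (1 - c) * A_norm^T r + c * q with a scalar c."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    # Row-normalize the adjacency matrix (assumes no node has zero out-degree).
    trans = adj / adj.sum(axis=1, keepdims=True)
    # Indicator vector of the query node: every restart lands here.
    q = np.zeros(n)
    q[query] = 1.0
    r = q.copy()
    for _ in range(max_iter):
        # Same restart probability no matter where the surfer currently is.
        r_next = (1.0 - restart) * (trans.T @ r) + restart * q
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r
```

The returned vector holds one relevance score per node with respect to the query node. Because `restart` is a single scalar, the walk cannot express that the surfer should be more or less likely to return from particular nodes.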
