Abstract

As the internet and the Internet of Things (IoT) are applied across many fields, large numbers of graphs are continuously produced and change over time, which makes graph analysis and utilization difficult. This paper studies a PageRank update algorithm for streaming graphs based on incremental random walks. We focus on the local changes to nodes and edges in the current graph, analyze the impact of these changes on the graph, and then use the idea behind wave propagation theory to identify all affected nodes whose PageRank must be updated in the new graph. For new nodes, the existing nodes in the current graph that are connected to them are aggregated into a supernode, and the PageRank of the new nodes is then solved in the new graph containing this supernode. Finally, we conduct a series of experiments on real-world and synthetic graph datasets. Compared with the state-of-the-art incremental computing algorithm, our algorithm not only preserves the accuracy of PageRank on a large streaming graph but also speeds up computation by avoiding many redundant calculations.
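The idea of locating affected nodes by letting a local change spread outward like an attenuating wave can be illustrated with a minimal sketch. This is not the paper's algorithm: the function name `find_affected_nodes`, the `threshold` cutoff, and the attenuation rule (damping factor divided by out-degree at each hop) are assumptions chosen only to show how a bounded propagation limits the set of nodes to recompute.

```python
from collections import deque


def find_affected_nodes(out_edges, changed_sources, damping=0.85, threshold=1e-3):
    """Return nodes reached by an attenuating 'wave' from the changed nodes.

    out_edges: dict mapping node -> list of successor nodes (hypothetical format).
    changed_sources: nodes whose outgoing edges were inserted or deleted.
    The attenuation rule and threshold are illustrative assumptions.
    """
    strength = {}
    queue = deque()
    for node in changed_sources:
        strength[node] = 1.0  # the change starts at full strength
        queue.append(node)

    while queue:
        node = queue.popleft()
        successors = out_edges.get(node, [])
        if not successors:
            continue
        # Each hop weakens the wave by the damping factor and splits it
        # evenly across the node's outgoing edges.
        passed = damping * strength[node] / len(successors)
        if passed < threshold:
            continue  # too weak to matter; stop propagating here
        for nxt in successors:
            if passed > strength.get(nxt, 0.0):
                strength[nxt] = passed
                queue.append(nxt)

    return set(strength)
```

For example, on the chain `a -> b -> c -> d` with `threshold=0.7`, a change at `a` reaches `b` (strength 0.85) and `c` (strength 0.7225) but stops before `d`, so only three nodes would be recomputed.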

Highlights

  • To evaluate the importance of web pages, the concept of PageRank in Google was first introduced by Page and Brin [1]

  • At any given moment, the change is usually small compared with the whole streaming graph, e.g., the total number of articles added was less than 4% for English Wikipedia in 2019 [4]

  • This paper focuses on the local change information about node or edge insertion or deletion in a streaming graph, and designs an algorithm based on the idea behind wave propagation theory to find all the nodes affected by such local changes

Summary

INTRODUCTION

To evaluate the importance of web pages, the concept of PageRank in Google was first introduced by Page and Brin [1]. A PageRank computed once cannot remain valid in a dynamic streaming graph, because the graph's structure changes whenever nodes or edges are added or deleted. In this situation, PageRank needs to be updated after dynamic changes [3]. To the best of our knowledge, existing incremental methods still have shortcomings in identifying the nodes whose PageRank needs to be updated, which makes it difficult to balance computing efficiency against calculation error. We use the idea of a random walk to design a novel incremental computation algorithm for updating PageRank in a dynamic streaming graph.
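As background for the random-walk idea, the following is a minimal sketch of the standard Monte Carlo way to estimate PageRank on a static graph: start a fixed number of walks from every node, continue each walk with probability equal to the damping factor, and use normalized visit counts as the estimate. It is not the incremental algorithm of this paper. The function name `monte_carlo_pagerank`, the adjacency-dictionary input, and the choice to end a walk at a node with no outgoing edges are assumptions made for illustration.

```python
import random
from collections import defaultdict


def monte_carlo_pagerank(out_edges, walks_per_node=200, damping=0.85, seed=0):
    """Estimate PageRank from visit counts of terminating random walks.

    out_edges: dict mapping node -> list of successor nodes (hypothetical format).
    """
    rng = random.Random(seed)
    nodes = set(out_edges)
    for targets in out_edges.values():
        nodes.update(targets)

    visits = defaultdict(int)
    # Sort so the result is reproducible for a given seed.
    for start in sorted(nodes):
        for _ in range(walks_per_node):
            node = start
            while True:
                visits[node] += 1
                successors = out_edges.get(node, [])
                # Stop at a dangling node (an assumption of this sketch),
                # or with probability 1 - damping at any other node.
                if not successors or rng.random() > damping:
                    break
                node = rng.choice(successors)

    total = sum(visits.values())
    return {node: visits[node] / total for node in nodes}
```

A usage example: for `{"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a", "b", "c"]}`, the estimate assigns the largest score to `hub`, and the scores sum to 1. In an incremental setting, only walks that pass through changed nodes or edges would need to be re-simulated, which is where the savings over full recomputation come from.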

RELATED WORK
TRADITIONAL METHOD TO CALCULATE
THE MONTE CARLO IDEA TO CALCULATE
PAGERANK UPDATE PROBLEM
AFFECTED AREA DUE TO GRAPH CHANGES
WAVE PROPAGATION PHENOMENON
FINDING AFFECTED NODES FOR UPDATING
UPDATING THE PAGERANK OF THE AFFECTED
CALCULATING THE PAGERANK OF NEW NODES IN
COMPREHENSIVE ALGORITHM FOR ALL NODES
EXPERIMENTAL ENVIRONMENT
EXPERIMENTS AND ANALYSIS
Findings
CONCLUSIONS
