Cache-Efficient Approach for Index-Free Personalized PageRank

Kohei Tsuchida,Kunitake Kaneko,Andrew Shin,Naoki Matsumoto

doi:10.1109/access.2023.3237738

Abstract

Personalized PageRank (PPR) measures the importance of vertices with respect to a source vertex. Since real-world graphs are evolving rapidly, PPR computation methods need to be index-free and fast. Unfortunately, existing index-free methods suffer from cache misses. They follow the state-of-the-art algorithm that first performs the Forward Push (FP) phase and subsequently runs the random walk Monte-Carlo simulation (MC) phase. Although existing methods succeed in reducing cache misses in the FP phase, an inefficient data layout limits their performance improvement. Besides, existing methods have overlooked the importance of reducing cache misses in the MC phase. In this paper, we propose a cache-efficient approach that accelerates both FP and MC phases. In the FP phase, we first reorder the data layout with low overheads. Specifically, we utilize the Breadth First Search result so that vertices near the source vertex are co-located on the reordered data layout. We subsequently perform optimized FP, namely Distance-Extension Forward Push (DEFP). By preferentially proceeding FP around the source vertex, DEFP improves memory access locality. In the MC phase, we perform optimized MC, namely Vertex-Centric Random Walk (VCRW). VCRW aggregates random walks at each vertex to eliminate redundant memory access for repeatedly obtaining neighbor vertices. We prove that most of the random walks can be aggregated while maintaining accuracy guarantees. Experimental results show that the proposed method is up to 4.7x faster than existing index-free methods and outperforms the state-of-the-art index-oriented method under rigorous accuracy guarantees.

Full Text