Abstract

Prioritized computation has shown promising performance for a large class of graph algorithms. It prioritizes the execution of vertices that play important roles in determining convergence. For large-scale distributed graph processing, graph partitioning is an important preprocessing step that aims to balance workload and reduce communication costs between workers. However, existing graph partitioning methods are designed for round-robin synchronous distributed frameworks. They balance workload without distinguishing vertices by importance and fail to consider the characteristics of priority-based scheduling, which may limit the benefit of prioritized graph computation. In this article, to accelerate prioritized iterative graph computations, we propose Hotness Balanced Partition (HBP). In prioritized graph computation, high-priority vertices are likely to be executed more frequently and to pass more messages, which makes them hot vertices. Based on this observation, we partition the graph by weighting vertices according to their hotness rather than blindly treating all vertices as equally weighted, aiming to distribute the hot vertices evenly among workers. We further provide two HBP algorithms: a streaming-based algorithm for efficient one-pass processing and a distributed algorithm for distributed processing. Our results show that our proposed partitioning methods outperform the state-of-the-art partitioning methods Fennel, HotGraph, and SNE.
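
To make the idea of balancing hotness rather than vertex count concrete, the following is a minimal sketch of a one-pass greedy partitioner. It is not the HBP algorithm from the article: the abstract does not define how hotness is measured or how vertices are scored, so the function name `hotness_balanced_stream`, the `slack` parameter, the neighbor-count score, and the use of vertex degree as a stand-in for hotness are all assumptions made for illustration.

```python
def hotness_balanced_stream(adjacency, hotness, num_parts, slack=1.1):
    """Greedy one-pass partitioner that balances summed hotness per part.

    Hypothetical sketch: the scoring rule below is a generic streaming
    heuristic (prefer parts holding more neighbors, discounted by how
    full the part is), with fullness measured in hotness, not vertices.
    """
    total = sum(hotness.values())
    # Per-part hotness budget; fall back to 1.0 to avoid division by zero.
    capacity = (slack * total / num_parts) or 1.0
    load = [0.0] * num_parts   # accumulated hotness per part
    assignment = {}            # vertex -> part index

    for v, neighbors in adjacency.items():  # vertices arrive in stream order
        # Count neighbors of v that were already placed in each part.
        placed = [0] * num_parts
        for u in neighbors:
            p = assignment.get(u)
            if p is not None:
                placed[p] += 1

        def score(p):
            # Primary: neighbor affinity discounted by hotness load.
            # Secondary: prefer the part with the lower hotness load.
            return (placed[p] * (1.0 - load[p] / capacity), -load[p])

        best = max(range(num_parts), key=score)
        assignment[v] = best
        load[best] += hotness[v]

    return assignment, load


if __name__ == "__main__":
    # Small undirected example; degree is used as an assumed hotness proxy.
    graph = {
        "a": ["b", "c", "d"],
        "b": ["a"],
        "c": ["a"],
        "d": ["a", "e"],
        "e": ["d"],
    }
    degree_hotness = {v: float(len(ns)) for v, ns in graph.items()}
    parts, loads = hotness_balanced_stream(graph, degree_hotness, num_parts=2)
    print(parts, loads)
```

In this sketch a part that already holds hot vertices becomes less attractive even if it holds few vertices overall, which is the distinction the abstract draws against partitioners that weight every vertex equally.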
