An Effective 2-Dimension Graph Partitioning for Work Stealing Assisted Graph Processing on Multi-FPGAs

Fan Zhang, Xiaofei Liao, Xinqiao Lv, Hai Jin, Long Zheng, Jiang Xiao

doi:10.1109/tbdata.2020.3035090

Abstract

Multi-FPGA architectures have attracted great interest for accelerating large-scale graph processing because of their potential for high throughput and energy efficiency. As a beneficial complement, work stealing effectively balances the computational workload across FPGAs at runtime. Unfortunately, existing graph partitioning schemes, originally designed for distributed settings, can be a poor match for work-stealing-enabled multi-FPGA systems, where computation is already balanced but communication overhead becomes unprecedentedly significant. In this paper, we present a 2-dimension balanced graph partitioning scheme for work-stealing-assisted graph systems on multi-FPGAs that reduces communication overhead while preserving the optimal performance of work stealing. Our approach is novel in 1) exploring the tradeoff between the load-balance dimension and the communication dimension in work-stealing-enabled graph processing systems to achieve optimal performance, and 2) optimizing memory access sequences to improve the granularity of graph partitioning for high-throughput graph analytics. Our experimental results show that our approach achieves speedups of 1.63x to 2.56x over state-of-the-art FPGA-based graph processing systems.

Full Text