ScalaBFS: A Scalable BFS Accelerator on FPGA-HBM Platform

Chenhao Liu, Zhiyuan Shao, Minkang Wu, Xiaofei Liao, Hai Jin, Ruoshi Li, Kexin Li, Jiajie Chen

doi:10.1145/3431920.3439463

Abstract

High Bandwidth Memory (HBM) provides massive aggregate memory bandwidth by exposing multiple memory channels to the processing units. To achieve high performance, an accelerator built on an FPGA equipped with HBM (i.e., an FPGA-HBM platform) needs to scale its performance with the number of available memory channels. In this paper, we propose a BFS (Breadth-First Search) accelerator, named ScalaBFS, which decouples memory access from processing so that its performance scales with the available HBM memory channels. Moreover, by attaching multiple processing elements to each HBM memory channel, ScalaBFS fully exploits the memory bandwidth of HBM. We implement a prototype of ScalaBFS and run BFS on both real-world and synthetic scale-free graphs on a Xilinx Alveo U280 Data Center Accelerator card (real hardware). The experimental results show that ScalaBFS scales almost linearly with the number of memory pseudo channels (PCs) made available by the HBM2 subsystem of the U280. By using all 32 PCs and building 64 processing elements (PEs) on the U280, ScalaBFS achieves up to 19.7 GTEPS (Giga Traversed Edges Per Second). When running BFS on sparse real-world graphs, ScalaBFS achieves GTEPS equivalent to Gunrock running on the state-of-the-art Nvidia V100 GPU, which features 64-PC HBM2 (twice the memory bandwidth of the U280).
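To make the GTEPS metric quoted in the abstract concrete, the following is a minimal software sketch of a level-synchronous BFS that counts the edges it inspects and converts that count into GTEPS. It is not the ScalaBFS hardware design. The function names `bfs_levels` and `gteps` and the toy graph are hypothetical, and the edge-counting convention used here (every inspected adjacency entry counts once) is an assumption that may differ from the one used in the paper's measurements.

```python
def bfs_levels(adj, root):
    """Level-synchronous BFS over an adjacency list.

    Returns (level, traversed), where level maps each reached vertex to its
    BFS depth and traversed counts every adjacency entry inspected
    (an assumed counting convention, not necessarily the paper's).
    """
    level = {root: 0}
    frontier = [root]
    traversed = 0
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                traversed += 1          # one edge inspected
                if v not in level:      # first visit: record depth
                    level[v] = depth
                    next_frontier.append(v)
        frontier = next_frontier
    return level, traversed


def gteps(traversed_edges, seconds):
    """Giga Traversed Edges Per Second for one BFS run."""
    return traversed_edges / seconds / 1e9


# Hypothetical 4-vertex undirected graph stored as adjacency lists.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
level, traversed = bfs_levels(adj, 0)
```

Under this convention, a run that inspects 19.7 billion edges in one second corresponds to 19.7 GTEPS, the peak figure reported in the abstract.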

Similar Papers

ScalaBFS2: A High-performance BFS Accelerator on an HBM-enhanced FPGA Chip
Kexin Li ... Zhiyuan Shao
ACM Transactions on Reconfigurable Technology and Systems | VOL. 17
30 Apr 2024

Scheduling Memory Access Optimization for HBM Based on CLOS
Shuang Xue ... Qizhe Wu
19 Feb 2023

IPUG: Accelerating Breadth-First Graph Traversals Using Manycore Graphcore IPUs
Luk Burchard ... Konstantin Pogorelov
01 Jan 2020

GraphScale: Scalable Processing on FPGAs for HBM and Large Graphs
Jonas Dann ... Holger Fröning
ACM Transactions on Reconfigurable Technology and Systems | VOL. 17
13 Mar 2024
