Abstract

Graph attention networks (GATs) are a mainstream class of graph neural network (GNN) models and outperform other GNN models on several tasks. However, graph data structures are irregular and the data dependencies in GATs are complex, so general-purpose hardware cannot deliver sufficient performance or energy efficiency; a specialized GAT accelerator is therefore needed. In this brief, we propose FTW-GAT to accelerate GAT inference. The key idea of our approach is to quantize the weights of GATs to ternary values, which greatly simplifies the processing elements (PEs), eliminates the dependence on digital signal processors (DSPs), and reduces power consumption. We then apply operation fusion, multi-level pipelining, and graph partitioning to improve parallelism. Finally, we implement the accelerator on a Xilinx VCU128 FPGA platform. The results show that FTW-GAT achieves speedups of 390×, 17×, and 1.4×, and energy-efficiency improvements of 4007×, 261×, and 3.1×, compared to CPUs, GPUs, and a prior GAT accelerator, respectively.
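To illustrate the key idea, the sketch below shows one common threshold-based weight ternarization scheme; the thresholding rule, the `delta_scale` parameter, and the shared scaling factor are assumptions for illustration, not necessarily the paper's exact method. With weights restricted to {-1, 0, +1}, each multiply in a PE collapses into an add, a subtract, or a skip, which is what removes the need for DSP multipliers.

```python
import numpy as np

def ternarize_weights(w, delta_scale=0.7):
    """Minimal ternarization sketch (hypothetical parameters).

    Maps each weight to {-1, 0, +1} plus a shared per-tensor scale alpha:
    values with |w| below a threshold become 0; the rest keep their sign.
    """
    delta = delta_scale * np.mean(np.abs(w))                # heuristic threshold (assumption)
    mask = np.abs(w) > delta                                # positions that stay non-zero
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0   # shared scale, applied once per dot product
    w_ternary = (np.sign(w) * mask).astype(np.int8)         # entries in {-1, 0, +1}
    return w_ternary, alpha

# Example: ternarize a random weight matrix.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 16)).astype(np.float32)
w_t, alpha = ternarize_weights(w)
```

In hardware, the accumulation over a ternary weight vector needs only adders and sign logic, with the single float multiply by alpha deferred to the end of each dot product.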
