FARNN: FPGA-GPU Hybrid Acceleration Platform for Recurrent Neural Networks

Hyungmin Cho, Jaejin Lee, Jeesoo Lee

doi:10.1109/tpds.2021.3124125

Abstract

GPU-based platforms provide high computation throughput for large mini-batch deep neural network computations. However, a large batch size may not be ideal for some situations, such as aiming at low latency, training on edge/mobile devices, partial retraining for personalization, and having irregular input sequence lengths. GPU performance suffers from low utilization especially for small-batch recurrent neural network (RNN) applications where sequential computations are required. In this article, we propose a hybrid architecture, called FARNN, which combines a GPU and an FPGA to accelerate RNN computation for small batch sizes. After separating RNN computations into GPU-efficient and GPU-inefficient tasks, we design special FPGA computation units that accelerate the GPU-inefficient RNN tasks. FARNN off-loads the GPU-inefficient tasks to the FPGA. We evaluate FARNN with synthetic RNN layers of various configurations on the Xilinx UltraScale+ FPGA and the NVIDIA P100 GPU in addition to evaluating it with real RNN applications. The evaluation result indicates that FARNN outperforms the P100 GPU platform for RNN training by up to 4.2× with small batch sizes, long input sequences, and many RNN cells per layer.
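The abstract does not say which RNN tasks fall into the GPU-efficient and GPU-inefficient groups. As a rough illustration of why such a split can exist at all, the sketch below separates a plain tanh RNN layer into a part with no dependence between timesteps and a part that must run step by step. The vanilla RNN cell, the function names, and the labelling of each part are assumptions made for illustration; this is not FARNN's actual partitioning or implementation.

```python
import math


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def input_projections(W_x, xs):
    """Non-recurrent part: W_x @ x_t for every timestep t.

    No timestep depends on another, so all products could be issued as one
    large batched multiply. Assumption: this is the kind of work that keeps
    a GPU busy even when the mini-batch is small.
    """
    return [matvec(W_x, x) for x in xs]


def recurrent_scan(W_h, projected, h0):
    """Recurrent part: h_t = tanh(W_x x_t + W_h h_{t-1}).

    Each step needs the previous hidden state, so the steps run strictly in
    order. Assumption: with a small batch, each step is a small operation,
    which is the kind of sequential work the abstract says hurts GPU
    utilization.
    """
    h = h0
    states = []
    for p in projected:
        rec = matvec(W_h, h)
        h = [math.tanh(a + b) for a, b in zip(p, rec)]
        states.append(h)
    return states


def rnn_forward(W_x, W_h, xs, h0):
    """Run the layer as two phases: batched projections, then the scan."""
    return recurrent_scan(W_h, input_projections(W_x, xs), h0)
```

In this sketch the cost of `input_projections` grows with sequence length but can be parallelized across timesteps, while `recurrent_scan` is a chain of dependent steps whose length equals the sequence length. That is consistent with the abstract's observation that the gap is largest for small batch sizes and long input sequences.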

Journal: IEEE Transactions on Parallel and Distributed Systems
Publication Date: Jul 1, 2022
Citations: 11
