E-BATCH: Energy-Efficient and High-Throughput RNN Batching

Franyell Silfa,Antonio González,Jose Maria Arnau

doi:10.1145/3499757

Abstract

Recurrent Neural Network (RNN) inference exhibits low hardware utilization due to the strict data dependencies across time-steps. Batching multiple requests can increase throughput. However, RNN batching requires a large amount of padding since the batched input sequences may vastly differ in length. Schemes that dynamically update the batch every few time-steps avoid padding. However, they require executing different RNN layers in a short time span, decreasing energy efficiency. Hence, we propose E-BATCH, a low-latency and energy-efficient batching scheme tailored to RNN accelerators. It consists of a runtime system and effective hardware support. The runtime concatenates multiple sequences to create large batches, resulting in substantial energy savings. Furthermore, the accelerator notifies it when the evaluation of an input sequence is done. Hence, a new input sequence can be immediately added to a batch, thus largely reducing the amount of padding. E-BATCH dynamically controls the number of time-steps evaluated per batch to achieve the best trade-off between latency and energy efficiency for the given hardware platform. We evaluate E-BATCH on top of E-PUR and TPU. E-BATCH improves throughput by 1.8× and energy efficiency by 3.6× in E-PUR, whereas in TPU, it improves throughput by 2.1× and energy efficiency by 1.6×, over the state-of-the-art.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

R Discovery Prime

R Discovery Prime

E-BATCH: Energy-Efficient and High-Throughput RNN Batching

Abstract

Talk to us

Similar Papers

More From: ACM Transactions on Architecture and Code Optimization

Lead the way for us

Journal: ACM Transactions on Architecture and Code Optimization	Publication Date: Jan 23, 2022
Citations: 5

Similar Papers

Editor's evaluation: Neural population dynamics of computing with synaptic modulations
Gianluigi Mongillo
-
Gianluigi MongilloGianluigi Mongillo
08 Jan 2023
08 Jan 2023

Author response: Neural population dynamics of computing with synaptic modulations
Kyle Aitken ... Stefan Mihalas
-
Kyle Aitken, et. al.Kyle Aitken ... Stefan Mihalas
10 Feb 2023
10 Feb 2023

Decision letter: Neural population dynamics of computing with synaptic modulations
Omri Barak ... Joshua I Gold
-
Omri Barak, et. al.Omri Barak ... Joshua I Gold
08 Jan 2023
08 Jan 2023

Deep Recurrent Convolutional Neural Network for Remaining Useful Life Prediction
Meng Ma ... Zhu Mao
-
Meng Ma, et. al.Meng Ma ... Zhu Mao
01 Jun 2019
01 Jun 2019

Editage

Paperpal

R Discovery

Mind the Graph

R Discovery Prime

R Discovery Prime

E-BATCH: Energy-Efficient and High-Throughput RNN Batching

Abstract

Talk to us

Similar Papers

More From: ACM Transactions on Architecture and Code Optimization