Abstract

This study presents AERO, a resource-efficient reconfigurable inference processor for recurrent neural networks (RNN). AERO is programmable to perform inference on RNN models of various types. It was designed based on an instruction-set architecture specialized for processing the primitive vector operations that compose the dataflows of RNN models. A versatile vector-processing unit (VPU) was incorporated to perform every vector operation and achieve high resource efficiency. Aiming at low resource usage, the multiplication in the VPU is carried out on the basis of an approximation scheme, and the activation functions are realized with reduced tables. We developed a prototype inference system based on AERO using a resource-limited field-programmable gate array, under which the functionality of AERO was verified extensively for inference tasks based on several RNN models of different types. The resource efficiency of AERO was found to be as high as 1.28 MOP/s/LUT, 1.3 times higher than the previous state-of-the-art result.
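The abstract does not detail how the reduced activation tables work. A common realization of this idea, sketched here purely as an illustration (the constants `TABLE_BITS` and `X_MAX` are assumptions, not values from the paper), samples the activation function at a coarse grid and reconstructs intermediate points by linear interpolation, exploiting odd symmetry and saturation to keep the table small:

```python
import math

# Illustrative reduced-table tanh (not the paper's exact scheme).
# tanh is sampled at a coarse grid; intermediate points are reconstructed
# by linear interpolation between adjacent table entries.
TABLE_BITS = 6                      # assumed: 64 intervals instead of a fine-grained table
X_MAX = 4.0                         # tanh is effectively saturated beyond |x| ~ 4
STEP = X_MAX / (1 << TABLE_BITS)
TABLE = [math.tanh(i * STEP) for i in range((1 << TABLE_BITS) + 1)]

def tanh_reduced(x: float) -> float:
    """Approximate tanh(x) from the reduced table with linear interpolation."""
    sign = -1.0 if x < 0 else 1.0   # odd symmetry: store only x >= 0
    x = abs(x)
    if x >= X_MAX:
        return sign * 1.0           # saturation region needs no table entry
    idx = int(x / STEP)
    frac = x / STEP - idx
    return sign * (TABLE[idx] + frac * (TABLE[idx + 1] - TABLE[idx]))
```

With 64 intervals the interpolation error stays well below 1%, which is typically tolerable for RNN inference while cutting the table memory sharply.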

Highlights

  • This study presents a resource-efficient reconfigurable inference processor for recurrent neural networks (RNN), named AERO

  • The functionality of AERO was verified successfully by programming it to perform inference tasks based on the various RNN models listed in Tables 5 and 6

  • Table notes: The ALUT counts can be obtained from the ALM [24] counts, considering the number of ALUTs in each ALM of the target devices. The BRAM result corresponds to the instances implementing the activation memory (AM), weight memory (WM), bias memory (BM), and instruction memory (IM), which are associated directly with AERO. The number inside the parentheses corresponds to AERO itself, while the number outside the parentheses corresponds to the entire system


Summary

Introduction

Recurrent neural networks (RNN) are a class of artificial neural networks whose dataflows have feedback connections. Such recurrent dataflows enable inference to be performed in a stateful manner based on the current and past inputs, thereby recognizing temporal characteristics [1]. An efficient architecture for GRU inference was previously presented based on a modified model exploiting temporal sparsity [17]. AERO is an instruction-set processor that can be programmed to perform RNN inference on models of various types; its instruction-set architecture (ISA) is formulated to efficiently perform the common primitive vector operations composing the dataflows of such models.
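To make concrete what "primitive vector operations composing the dataflows" means, the sketch below expresses one GRU time step entirely in terms of such primitives (matrix-vector product, elementwise add/multiply, activation). The function names and decomposition are illustrative assumptions, not AERO's actual ISA:

```python
import math

# Hypothetical decomposition of a GRU step into primitive vector operations,
# the kind of primitives an RNN-oriented ISA could dispatch. Biases are
# omitted for brevity.

def matvec(W, x):
    """Matrix-vector product primitive."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def vadd(a, b):
    """Elementwise vector addition primitive."""
    return [x + y for x, y in zip(a, b)]

def vmul(a, b):
    """Elementwise (Hadamard) vector multiplication primitive."""
    return [x * y for x, y in zip(a, b)]

def vact(v, f):
    """Elementwise activation primitive."""
    return [f(x) for x in v]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU time step built only from the primitives above."""
    z = vact(vadd(matvec(Wz, x), matvec(Uz, h)), sigmoid)            # update gate
    r = vact(vadd(matvec(Wr, x), matvec(Ur, h)), sigmoid)            # reset gate
    h_cand = vact(vadd(matvec(Wh, x), matvec(Uh, vmul(r, h))), math.tanh)
    one_minus_z = [1.0 - zi for zi in z]
    return vadd(vmul(one_minus_z, h), vmul(z, h_cand))               # new hidden state
```

Because LSTM, GRU, and vanilla RNN cells all reduce to this same small set of primitives, a processor whose instructions map onto them can be reprogrammed across model types without hardware changes.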

Dataflow of RNN Inference
RNN-Specific Instruction-Set Architecture
Processing Pipeline
Vector Processing Unit Based on the Approximate Multipliers
Activation Coefficient Unit Based on the Reduced Tables
Prototype Inference System
Results and Evaluation
Conclusions