Abstract

Over the past two decades, Long Short-Term Memory (LSTM) networks have been used to solve problems that require modeling long sequences, because they can selectively remember patterns over long periods and thus outperform traditional feed-forward neural networks and Recurrent Neural Networks (RNNs) in learning long-term dependencies. However, LSTM is characterized by feedback dependence, which limits the parallelism achievable on general-purpose processors such as CPUs and GPUs. Moreover, in terms of the energy efficiency of data-center applications, the high power consumption of GPU and CPU computing cannot be ignored. To address these problems, the Field Programmable Gate Array (FPGA) is becoming an attractive alternative: its low power consumption and low latency make it well suited to accelerating and optimizing LSTM and other RNNs. This paper proposes an FPGA-based implementation of an LSTM acceleration engine and further optimizes the implementation through fixed-point arithmetic, a systolic array, and lookup tables for nonlinear functions. On this basis, for easy deployment and application, we integrate the proposed acceleration engine into Caffe, one of the most popular deep learning frameworks. Experimental results show that, compared with a CPU and a GPU, the FPGA-based acceleration engine achieves performance improvements of 8.8 and 2.2 times and energy-efficiency improvements of 16.9 and 9.6 times, respectively, within the Caffe framework.
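Two of the optimizations named in the abstract, fixed-point arithmetic and a lookup table (LUT) for nonlinear functions, can be illustrated with a minimal software sketch. The Q8.8 format, the 256-entry table, and the input range below are illustrative assumptions, not the paper's actual design parameters.

```python
# Sketch of fixed-point arithmetic plus a LUT-based sigmoid,
# two common FPGA optimizations mentioned in the abstract.
# Q8.8 format and 256-entry table are assumed for illustration.
import math

FRAC_BITS = 8              # Q8.8: 8 integer bits, 8 fractional bits
SCALE = 1 << FRAC_BITS

def to_fixed(x: float) -> int:
    """Quantize a float to Q8.8 fixed point."""
    return int(round(x * SCALE))

def to_float(q: int) -> float:
    """Convert a Q8.8 value back to float."""
    return q / SCALE

# Precompute sigmoid over [-8, 8); inputs outside this range saturate,
# since sigmoid is nearly 0 or 1 there.
LUT_SIZE = 256
LUT_MIN, LUT_MAX = -8.0, 8.0
STEP = (LUT_MAX - LUT_MIN) / LUT_SIZE
SIGMOID_LUT = [to_fixed(1.0 / (1.0 + math.exp(-(LUT_MIN + i * STEP))))
               for i in range(LUT_SIZE)]

def sigmoid_fixed(q: int) -> int:
    """Approximate sigmoid on a Q8.8 input via table lookup."""
    x = to_float(q)
    if x < LUT_MIN:
        return 0
    if x >= LUT_MAX:
        return to_fixed(1.0)
    idx = int((x - LUT_MIN) / STEP)
    return SIGMOID_LUT[idx]
```

On hardware, the table lookup replaces the expensive exponential with a single memory read, and fixed-point values replace floating-point units with cheap integer logic; the trade-off is quantization error bounded by the table step and the fractional precision.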

Highlights

  • As one of the most difficult problems in data science, sequence prediction, such as speech recognition [1] and language understanding [2], has been around for a long time

  • To overcome the challenges of computing and energy efficiency, we propose an FPGA-based Long Short-Term Memory (LSTM) acceleration engine and integrate it into the Caffe framework [8] to make the LSTM network easier to deploy

  • To take advantage of the convenience and flexibility of Caffe, we propose a general Caffe-based LSTM acceleration system in which the CPU and a Field Programmable Gate Array (FPGA) cooperate

Summary

Introduction

As one of the most difficult problems in data science, sequence prediction, such as speech recognition [1] and language understanding [2], has been studied for a long time. With the technical breakthroughs of data science, especially deep learning networks, the LSTM network [3] has gradually become an effective solution for almost all sequence problems. LSTMs are widely used in many sequence modeling tasks, including many natural language processing tasks. Like other deep learning networks, the LSTM network's model size is constantly increasing to improve its accuracy. In addition to the high power consumption of GPUs, the inherent recurrent characteristics of LSTM become the main bottleneck for parallel processing on GPUs.
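The recurrent bottleneck described above can be made concrete with a toy LSTM cell: each time step reads the previous step's hidden and cell states, so the steps form a serial chain that cannot be computed in parallel across time. The scalar state and the weight names below are illustrative, not the formulation of any particular framework.

```python
# Minimal scalar LSTM step showing the feedback dependence:
# h_t and c_t are computed from h_{t-1} and c_{t-1}, so time
# steps must run sequentially. Weights/sizes are toy examples.
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM time step for scalar input and state."""
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate
    c = f * c_prev + i * g       # new cell state needs c_{t-1}
    h = o * math.tanh(c)         # new hidden state needs the new c
    return h, c

def run_sequence(xs, w):
    """This loop is the serial dependence: step t waits for step t-1."""
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, w)
    return h
```

A GPU can parallelize the matrix multiplications inside one step, but the loop in `run_sequence` stays sequential; an FPGA pipeline can instead hide that latency by streaming the per-step computation through dedicated hardware.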
