Abstract

In this brief, a low-resource-utilization field-programmable gate array (FPGA)-based long short-term memory (LSTM) network architecture for accelerating the inference phase is presented. The architecture achieves low power consumption and high speed by overlapping the timing of operations and pipelining the datapath. Moreover, it requires negligible internal memory for storing intermediate data, leading to low resource utilization and simple routing, which in turn yields lower interconnect delay (and hence a higher operating frequency). A designer may readily adjust the resource utilization (as well as the latency) of the proposed architecture at the register-transfer level (RTL) by tuning the degree of parallelization. This makes mapping the architecture onto different types of FPGAs, subject to defined constraints, a simple process. The efficacy of the proposed architecture is assessed by implementing an LSTM network on different types of FPGAs. Compared with recent works, the proposed architecture provides up to about $1.6\times$, $43.6\times$, $21.9\times$, and $114.5\times$ improvements in frequency, power efficiency, GOP/s, and GOP/s/W, respectively. Finally, our proposed architecture operates at 17.64 GOP/s, which is $2.31\times$ faster than the best previously reported result.
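For context, the computation such an accelerator implements is the standard LSTM cell recurrence. The sketch below is a minimal scalar Python rendering of those textbook equations, not the paper's RTL design; the function and parameter names (`lstm_step`, the gate dictionaries `W`, `U`, `b`) are illustrative, and a real layer vectorizes these operations over the hidden dimension, which is precisely where the architecture's adjustable parallelization applies.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM inference step for a single-unit cell.

    Gates: i (input), f (forget), g (candidate), o (output).
    W, U, b are dicts keyed by gate name; x, h_prev, c_prev are
    scalars here for clarity.
    """
    i = sigmoid(W['i'] * x + U['i'] * h_prev + b['i'])
    f = sigmoid(W['f'] * x + U['f'] * h_prev + b['f'])
    g = math.tanh(W['g'] * x + U['g'] * h_prev + b['g'])
    o = sigmoid(W['o'] * x + U['o'] * h_prev + b['o'])
    c = f * c_prev + i * g          # new cell state
    h = o * math.tanh(c)            # new hidden state
    return h, c
```

The four gate matrix-vector products are mutually independent, which is what makes the datapath pipelining and operation overlapping described in the abstract effective in hardware.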
