Abstract

With the increasing use of multi-purpose artificial intelligence of things (AIoT) devices, embedded field-programmable gate arrays (FPGAs) are excellent platforms for deep neural network (DNN) acceleration on edge devices. FPGAs offer low latency and high energy efficiency, but the scarcity of FPGA development resources challenges the deployment of DNN-based edge devices. Building a high-performance FPGA accelerator for DNNs requires register-transfer-level programming, hardware verification, and precise resource allocation; these tasks are challenging and time-consuming even for experienced hardware developers. We therefore propose an automated, collaborative design process built around an automatic design space exploration tool; an automatic DNN engine enables the tool to reshape and parse a DNN model from software to hardware. We also introduce a long short-term memory (LSTM)-based model that predicts performance and automatically generates a DNN model suited to developer requirements. We demonstrate our design scheme on three FPGAs: a ZCU104, a ZCU102, and a Cyclone V SoC (system on chip). The results show that our hardware-based edge accelerator achieves superior throughput compared with the most advanced edge graphics processing unit (GPU).
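The LSTM-based performance predictor is described only at a high level here. As a concrete illustration, the following is a minimal, hypothetical PyTorch sketch of such a model, assuming each candidate DNN is encoded as a sequence of fixed-length per-layer feature vectors (layer type, channel counts, kernel size, and so on) and that the model regresses performance targets such as latency and resource utilization. The name `PerfPredictor`, the feature dimensions, and the targets are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of an LSTM-based performance predictor, as a
# plausible reading of the approach described in the abstract.
# Assumptions (not from the paper): each DNN layer is encoded as a
# fixed-length feature vector, and the model regresses two targets,
# e.g., latency and resource utilization.
import torch
import torch.nn as nn

class PerfPredictor(nn.Module):
    def __init__(self, feat_dim=8, hidden_dim=64, num_targets=2):
        super().__init__()
        # One LSTM step per DNN layer descriptor (type, channels, kernel, ...)
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        # Regression head, e.g., [latency_ms, lut_utilization]
        self.head = nn.Linear(hidden_dim, num_targets)

    def forward(self, layer_feats):
        # layer_feats: (batch, num_layers, feat_dim)
        _, (h_n, _) = self.lstm(layer_feats)
        # Use the final hidden state as a summary of the whole network
        return self.head(h_n[-1])  # (batch, num_targets)

# Toy usage: score a batch of 4 candidate models, each described by
# 10 layers with 8 features per layer.
model = PerfPredictor()
candidates = torch.randn(4, 10, 8)
print(model(candidates).shape)  # torch.Size([4, 2])
```

In a design space exploration loop, a predictor of this kind can score many candidate configurations in milliseconds, avoiding a full synthesis and place-and-route run for each candidate.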

Highlights

  • Recent studies have shown that field-programmable gate arrays (FPGAs) are promising candidates for deep neural network (DNN) implementation [1,2,3,4,5]

  • A DNN can be implemented directly in hardware, rather than on an existing central processing unit (CPU) or graphics processing unit (GPU), reducing latency and energy consumption

  • This study proposes an automated tool that maps the DNN design process from a deep learning framework to an FPGA


Introduction

Recent studies have shown that field-programmable gate arrays (FPGAs) are promising candidates for deep neural network (DNN) implementation [1,2,3,4,5]. A DNN can be implemented directly in hardware, rather than on an existing central processing unit (CPU) or graphics processing unit (GPU), reducing latency and energy consumption. These characteristics make FPGAs well suited to DNN-based applications in cloud and edge computing; as a result, FPGAs have been rapidly adopted for DNN acceleration. Internet of things (IoT) applications have stringent requirements in fields such as autonomous driving, safety, and monitoring; complex DNN models must produce quality results with minimal delay and power consumption while not exceeding resource constraints [6]. The scarcity of development resources makes the design and deployment of FPGA DNN accelerators challenging.
