Deep-Sea: A Reconfigurable Accelerator for Classic CNN

Hao Xiong, Huiping Xu, Bing Zhang, Kelin Sun, Jingchuan Yang

doi:10.1155/2022/4726652

Abstract

Real-time edge engineering applications of CNNs keep changing, yet CNN hardware acceleration architectures based on ARM+FPGA lack universality and flexibility. To address this, a general, low-power, fully pipelined CNN hardware acceleration architecture is proposed that can keep up with continuously updated CNN algorithms and run on hardware platforms with different resource constraints. Within this general architecture, a basic instruction set is defined that can be used to configure and compute different versions of CNN algorithms. Based on the instruction set, a configurable computing subsystem, a memory management subsystem, an on-chip cache subsystem, and an instruction execution subsystem are designed and implemented. In addition, the on-chip storage unit is used to preprocess the convolution results so that the activation and pooling calculations can be sped up in parallel. Finally, the accelerator is modeled at the RTL level and deployed on the XC7Z100 heterogeneous device. The lightweight networks YOLOv2-tiny and YOLOv3-tiny, commonly used in engineering applications, are verified on the accelerator. The results show that, at a 16-bit width, the accelerator reaches a peak performance of 198.37 GOP/s at a clock frequency of 210 MHz, with a power consumption of 4.52 W.
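The headline figures in the abstract can be combined into two derived quantities that are often used to compare accelerators: energy efficiency and operations per clock cycle. The sketch below is only back-of-the-envelope arithmetic on the three numbers quoted above. The function names are illustrative, and the conversion to multiply-accumulates assumes one MAC counts as two operations, which the abstract does not state.

```python
# Figures quoted in the abstract (16-bit width, XC7Z100).
PEAK_GOPS = 198.37   # peak throughput, GOP/s
CLOCK_MHZ = 210.0    # clock frequency, MHz
POWER_W = 4.52       # power consumption, W


def energy_efficiency(gops: float, watts: float) -> float:
    """Throughput per watt, in GOP/s/W."""
    return gops / watts


def ops_per_cycle(gops: float, clock_mhz: float) -> float:
    """Average number of operations completed per clock cycle at peak."""
    return (gops * 1e9) / (clock_mhz * 1e6)


efficiency = energy_efficiency(PEAK_GOPS, POWER_W)
per_cycle = ops_per_cycle(PEAK_GOPS, CLOCK_MHZ)

# Assumption (not from the source): one multiply-accumulate = 2 operations.
macs_per_cycle = per_cycle / 2

print(f"{efficiency:.2f} GOP/s/W")
print(f"{per_cycle:.1f} ops/cycle, about {macs_per_cycle:.0f} MACs/cycle")
```

Under these assumptions the figures work out to roughly 43.9 GOP/s/W and about 945 operations per cycle. How this maps onto the actual number of processing elements depends on details the abstract does not give.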

Highlights

With the development of deep convolutional neural networks, new convolutional neural network structures keep emerging
This paper builds a reconfigurable deep convolutional neural network accelerator for edge engineering applications, exploiting the deterministic latency and reconfigurability of FPGAs, and names it Deep-Sea
The work aims to apply this general convolutional neural network acceleration core to real-time engineering processing, so the YOLO family of algorithms, chosen for its high real-time performance, is used to verify the core

Summary

Introduction

With the development of deep convolutional neural networks, new convolutional neural network structures keep emerging. Some fields lag behind, however. In the specialized field of deep-sea exploration, image processing and analysis technologies and computing platforms for landers, AUVs, ROVs, and live video landers are underdeveloped. As a result, most signal processing still relies on manual control and centralized computation on the surface mother ship, and acoustic information remains the main source of environmental perception. On the one hand, this prevents optical image information from directly serving the environmental perception of underwater equipment, which limits how intelligent that equipment can become; on the other hand, a large amount of invalid data occupies the transmission bandwidth, which limits the three-dimensional deployment of the detection system over a large area and a wide range. Research on network lightweighting is developing rapidly, such

Journal: Wireless Communications and Mobile Computing
Publication Date: Feb 2, 2022
Citations: 4
License type: CC BY 4.0


