Abstract

To meet the changing real-time requirements of CNN edge engineering applications, and to address the limited generality and flexibility of existing ARM+FPGA CNN hardware acceleration architectures, a general-purpose, low-power, fully pipelined CNN hardware acceleration architecture is proposed. It is designed to keep pace with continuously updated CNN algorithms and to run on hardware platforms with different resource constraints. Within this general architecture, a basic instruction set is defined that can be used to configure and compute different versions of CNN algorithms. Based on the instruction set, a configurable computing subsystem, a memory management subsystem, an on-chip cache subsystem, and an instruction execution subsystem are designed and implemented. In addition, on-chip storage units preprocess the convolution results so that the activation and pooling calculations can be accelerated in parallel. Finally, the accelerator is modeled at the RTL level and deployed on the XC7Z100 heterogeneous device. The lightweight networks YOLOv2-tiny and YOLOv3-tiny, which are commonly used in engineering applications, are verified on the accelerator. The results show that, with a 16-bit data width, the accelerator reaches a peak performance of 198.37 GOP/s at a clock frequency of 210 MHz, with a power consumption of 4.52 W.
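
The three headline figures in the abstract can be combined into two derived quantities that are often used to compare accelerators: energy efficiency (GOP/s per watt) and operations per clock cycle. The sketch below is only arithmetic on the numbers quoted above; the derived values are not reported in this abstract, and the function and constant names are illustrative, not taken from the paper.

```python
# Figures quoted in the abstract (16-bit data width, XC7Z100 device).
PEAK_GOPS = 198.37   # peak performance, GOP/s
CLOCK_MHZ = 210.0    # clock frequency, MHz
POWER_W = 4.52       # power consumption, W


def energy_efficiency(gops: float, watts: float) -> float:
    """Return energy efficiency in GOP/s per watt."""
    return gops / watts


def ops_per_cycle(gops: float, clock_mhz: float) -> float:
    """Return the average number of operations completed per clock cycle."""
    return (gops * 1e9) / (clock_mhz * 1e6)


if __name__ == "__main__":
    eff = energy_efficiency(PEAK_GOPS, POWER_W)
    opc = ops_per_cycle(PEAK_GOPS, CLOCK_MHZ)
    print(f"energy efficiency: {eff:.2f} GOP/s/W")
    print(f"operations per cycle: {opc:.1f}")
```

With the quoted figures this gives roughly 43.9 GOP/s/W and about 945 operations per cycle. How those operations map onto multiply-accumulate units depends on how the authors count operations, which the abstract does not state.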

Highlights

  • With the development of deep convolutional neural networks, new convolutional neural network structures continue to emerge

  • This paper focuses on building a reconfigurable deep convolutional neural network accelerator, named deep-sea, for edge engineering application scenarios, exploiting the deterministic delay and reconfigurability of FPGAs

  • This research aims to apply this general convolutional neural network acceleration core to specific real-time engineering processing, so the YOLO series of algorithms, which offer high real-time performance, is selected to verify the acceleration core

Summary

Introduction

With the development of deep convolutional neural networks, new network structures continue to emerge. Some fields lag behind, however, such as the more specialized field of deep-sea exploration. Because image processing and analysis technologies and computing platforms for landers, AUVs, ROVs, and live video landers in deep-sea scenes have developed slowly, most signal processing still relies on manual control and centralized computation on the surface mother ship, and acoustic information remains the main source of environmental perception. On the one hand, this prevents optical image information from directly serving the environmental perception of underwater equipment, which limits the development of intelligent underwater equipment; on the other hand, the transmission bandwidth is occupied by a large amount of invalid data, which limits the three-dimensional deployment of the detection system over a large area and a wide range. Research on network lightweighting is developing rapidly, such
