A Heterogeneous Architecture for the Vision Processing Unit with a Hybrid Deep Neural Network Accelerator.

Peng Liu, Zikai Yang, Lin Kang, Jian Wang

doi:10.3390/mi13020268

Open Access

https://doi.org/10.3390/mi13020268

Abstract

The vision chip is widely used to acquire and process images. It connects the image sensor directly to the vision processing unit (VPU) to execute vision tasks. Modern vision tasks mainly consist of image signal processing (ISP) algorithms and deep neural networks (DNNs). However, traditional VPUs are unsuitable for DNNs, and DNN processing units (DNPUs) cannot process ISP algorithms. Meanwhile, only CNNs and CNN-RNN frameworks are used in vision tasks, and few DNPUs are specifically designed for them. In this paper, we propose a heterogeneous architecture for the VPU with a hybrid accelerator for DNNs. It can process the ISP, CNN, and hybrid DNN subtasks on one unit. Furthermore, we present a sharing scheme to multiplex the hardware resources across different subtasks. We also adopt a pipelined workflow for the vision tasks to fully use the different processing modules and achieve a high processing speed. We implement the proposed VPU on a field-programmable gate array (FPGA) and test several vision tasks on it. The experimental results show that our design can process the vision tasks efficiently, with an average performance of 22.6 giga operations per second per watt (GOPS/W).

Highlights

Academic Editor: Arman Roohi

The vision chips have shown excellent performance on the vision tasks by connecting the image sensor directly with the parallel vision processing unit (VPU) [1,2,3]
Two types of deep neural networks (DNNs) were tested on the VPU for the vision application, including
The hybrid DNNs were processed on the VPU to verify the pipelined workflow

Summary

Introduction

Vision chips have shown excellent performance on vision tasks by connecting the image sensor directly to the parallel vision processing unit (VPU) [1,2,3]. They can relieve the bottlenecks of massive image data transmission and processing in vision tasks. VPUs in early works [2,5] were mainly composed of an arithmetic and logic unit (ALU) array. They can accomplish image signal processing (ISP) tasks and some computer vision algorithms, such as speeded-up robust features (SURF) [6], at high speed. Hybrid neural networks [8,9] that combine CNNs and recurrent neural networks (RNNs) can be used for specific applications such as image captioning and video description [10,11].

Results

Discussion

Conclusion