A Novel Software-Defined Convolutional Neural Networks Accelerator

Yufeng Li, Yankang Du

doi:10.1109/access.2019.2956841

Abstract

The convolutional neural network (CNN) is widely used in computer vision applications. Because of the intensive computation involved, general-purpose central processing units (CPUs) cannot efficiently meet real-time requirements. Various hardware accelerators based on application-specific integrated circuits (ASICs) and field-programmable gate arrays (FPGAs) have been designed to accelerate CNN models during the inference phase. The data flow architecture has been extensively adopted because of its high parallelism. However, given the continual development of the computer vision field, CNN models have become increasingly diverse. Thus, CNN accelerators based on data flow architectures face an emerging challenge: maintaining high throughput while coping with various CNN models. In this paper, we design a software-defined architecture to address this challenge. The goal of this study is to make the hardware change as the application changes, achieving both high flexibility and high performance. In our proposed architecture, all parts can be software-defined to cope with different CNN models. A flexible software-defined processing element (PE) array is designed to compute with different weight filter sizes. In addition, a software-defined data reuse technique based on two ideal reuse cases is proposed to ensure that every parameter needs to be loaded only once during the computing phase. To support this reuse technique, we also propose a software-defined on-chip buffer in which the weight and image buffers share one dynamic buffer. The fully connected (FC) layer is accelerated by exploiting the sparsity of the input feature map: about 88% of the FC weight parameters can be skipped when loading the VGG-16 model. Finally, we implemented this software-defined accelerator on an FPGA. Compared with other FPGA-based accelerators, our proposed accelerator preserves high performance while maintaining flexibility.
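The sparsity idea for the FC layer can be illustrated with a small sketch. The abstract does not describe the exact mechanism, so the code below assumes the simplest reading: when an input activation is zero (for example after a ReLU), the corresponding row of FC weights contributes nothing to the output and need not be loaded. The function name `sparse_fc` and the row-per-input weight layout are hypothetical and chosen only for illustration; this is not the authors' hardware implementation.

```python
def sparse_fc(inputs, weights):
    """Fully connected layer that skips weight rows for zero inputs.

    inputs:  list of N activations (assumed sparse, e.g. post-ReLU).
    weights: list of N rows, each holding M weights (hypothetical layout:
             row i contains every weight that multiplies inputs[i]).
    Returns (outputs, loaded_rows), where loaded_rows counts the weight
    rows that actually had to be read.
    """
    n_out = len(weights[0]) if weights else 0
    outputs = [0.0] * n_out
    loaded_rows = 0
    for x, row in zip(inputs, weights):
        if x == 0:
            # Zero activation: this row cannot change the result, so skip it.
            continue
        loaded_rows += 1
        for j, w in enumerate(row):
            outputs[j] += x * w
    return outputs, loaded_rows


# Small usage example: half of the activations are zero.
activations = [0.0, 2.0, 0.0, 1.0]
fc_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
result, rows_loaded = sparse_fc(activations, fc_weights)
skipped_fraction = 1 - rows_loaded / len(fc_weights)
```

In this toy case two of the four weight rows are skipped. Under the same assumption, the fraction of skipped weights in a real model tracks the fraction of zero activations feeding the FC layer, which is consistent with the roughly 88% figure the abstract reports for VGG-16.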

Highlights

In recent years, the convolutional neural network (CNN) has emerged as the prevalent model for machine learning and computer vision
We focus on the inference phase, which is widely used in embedded vision systems
Similar in spirit to reconfigurable architectures, our proposed software-defined architecture can maintain high throughput when dealing with various CNN models

Summary

Introduction

The convolutional neural network (CNN) has emerged as the prevalent model for machine learning and computer vision. We designed a software-defined architecture for CNN accelerators.

Software-defined convolutional PE array: The processing element (PE) architecture, which determines flexibility and throughput, plays a key role in CNN accelerators.
