Abstract

The convolutional neural network (CNN) has been widely applied in computer vision applications. Because of the intensive computation involved, general-purpose central processing units (CPUs) cannot efficiently meet real-time requirements. Various hardware accelerators based on application-specific integrated circuits (ASICs) and field-programmable gate arrays (FPGAs) have been designed to accelerate CNN models during the inference phase. The data flow architecture has been extensively adopted because of its high parallelism. However, given the continual development of the computer vision field, CNN models have become increasingly diverse. Thus, CNN accelerators based on data flow architectures face an emerging challenge: maintaining high throughput while coping with various CNN models. In this paper, we design a software-defined architecture to address this challenge. The goal of this study is to let the hardware change as the application changes, achieving both high flexibility and high performance. In our proposed architecture, all the parts can be software-defined to cope with different CNN models. A flexible software-defined processing element (PE) array is designed to compute with different weight filter sizes. In addition, a software-defined data reuse technique based on two ideal reuse cases is proposed to ensure that every parameter needs to be loaded only once during the computing phase. To support this reuse technique, we also propose a software-defined on-chip buffer in which the weight and image buffers share one dynamic buffer. The fully connected (FC) layer is accelerated by exploiting the sparsity of the input feature map: about 88% of the FC weight parameters can be skipped when loading the VGG-16 model. Finally, we implemented this software-defined accelerator on an FPGA. Compared with other FPGA-based accelerators, our proposed accelerator preserves high performance while maintaining flexibility.
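The abstract does not describe how the FC weights are skipped, so the following is only a minimal sketch of one plausible reading: when an input activation is exactly zero (for example after a ReLU), every weight that multiplies it contributes nothing and need not be loaded. The function names `sparse_fc` and `dense_fc`, the row-major weight layout, and the toy data are all assumptions for illustration, not the authors' implementation.

```python
from typing import List, Tuple


def sparse_fc(inputs: List[float],
              weights: List[List[float]]) -> Tuple[List[float], int]:
    """Compute y = W x while skipping weights paired with zero inputs.

    Assumed layout: weights[j][i] connects input i to output j.
    Returns the outputs and the number of weights that were never read.
    """
    n_out = len(weights)
    outputs = [0.0] * n_out
    skipped = 0
    for i, x in enumerate(inputs):
        if x == 0.0:
            # A zero activation makes this whole weight column irrelevant,
            # so none of its n_out weights has to be fetched.
            skipped += n_out
            continue
        for j in range(n_out):
            outputs[j] += weights[j][i] * x
    return outputs, skipped


def dense_fc(inputs: List[float], weights: List[List[float]]) -> List[float]:
    """Reference dense computation used to check the sparse version."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


if __name__ == "__main__":
    # Toy post-ReLU activations: 6 of 8 entries are zero (hypothetical data).
    x = [0.0, 2.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
    W = [[float(i + j) for i in range(8)] for j in range(3)]
    y, skipped = sparse_fc(x, W)
    print("outputs:", y)
    print("fraction of weights skipped:", skipped / (len(W) * len(x)))
```

In this toy case the skipped fraction equals the fraction of zero inputs. The 88% figure reported for VGG-16 comes from the paper and is not reproduced by this sketch.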

Highlights

  • In recent years, the convolutional neural network (CNN) has emerged as the prevalent model for machine learning and computer vision.

  • We focus on the inference phase, which is widely used in embedded vision systems.

  • Similar to a reconfigurable architecture, we propose a software-defined architecture that can maintain high throughput when dealing with various CNN models.


Summary

Introduction

The convolutional neural network (CNN) has emerged as the prevalent model for machine learning and computer vision. We designed a software-defined architecture for CNN accelerators. On the software-defined convolutional PE array: the processing element (PE) architecture, which determines flexibility and throughput, plays a key role in CNN accelerators.

