A General-Purpose CNN Accelerator Based on Improved Systolic Array for FPGAs

Chengyu Yang, Yuan Yang, Lei Huang, Wei Yang, Yaohua Li

doi:10.1109/icicm56102.2022.10011386

Abstract

In recent years, convolutional neural networks (CNNs) have been increasingly used for image processing tasks such as target recognition and image enhancement. In the embedded and IoT fields, constraints on size and power make it impractical to deploy high-performance computing devices on the hardware platforms used there. This paper presents a general-purpose neural network accelerator that can be scaled to any size, so that it can be adapted to hardware platforms in different application areas, from data centers to IoT, according to their power consumption and computing power requirements. The accelerator is based on an improved systolic array, which improves data utilization and throughput and reduces system power consumption. It uses the int8 data type to meet the low-power requirements of embedded and IoT applications. The accelerator is deployed on the XC7Z020 hardware platform and compared with CPU and GPU platforms, with performance evaluated on a handwritten digit recognition neural network based on the MNIST dataset. The experimental results show that the energy efficiency ratio of the proposed design is 10 times better than that of current advanced CPUs and GPUs, and more than 300 times better than that of current CPUs in the embedded domain.
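
To make the int8 systolic-array idea concrete, the following is a minimal cycle-level sketch of a conventional output-stationary systolic array computing a matrix product. It is an illustrative assumption, not the improved array proposed in this paper: the function name `systolic_matmul_int8`, the output-stationary dataflow, and the skewed operand timing (element k reaches processing element (i, j) at cycle i + j + k) are all chosen here for illustration only.

```python
def systolic_matmul_int8(a, b):
    """Model an output-stationary systolic array computing C = A x B.

    Assumption (not from the paper): PE (i, j) holds the accumulator for
    C[i][j]; rows of A flow left to right and columns of B flow top to
    bottom, each skewed by one cycle per row/column.
    """
    m, k_dim, n = len(a), len(b), len(b[0])
    if any(len(row) != k_dim for row in a):
        raise ValueError("inner dimensions of A and B must match")
    # Operands are restricted to the signed 8-bit range.
    for row in a + b:
        for v in row:
            if not -128 <= v <= 127:
                raise ValueError("operands must fit in int8")

    # Accumulators are wider than the operands (Python ints stand in
    # for the wide accumulator registers a hardware PE would use).
    acc = [[0] * n for _ in range(m)]

    # The last operand pair reaches PE (m-1, n-1) at cycle m + n + k_dim - 3,
    # so m + n + k_dim - 2 cycles are simulated in total.
    cycles = m + n + k_dim - 2
    for t in range(cycles):
        for i in range(m):
            for j in range(n):
                k = t - i - j  # index of the operand pair arriving this cycle
                if 0 <= k < k_dim:
                    acc[i][j] += a[i][k] * b[k][j]
    return acc, cycles


if __name__ == "__main__":
    c, cycles = systolic_matmul_int8([[1, -2], [3, 4]], [[5, 6], [7, -8]])
    print(c, cycles)
```

In this model every processing element performs at most one multiply-accumulate per cycle, and each operand is reused as it passes through a row or column of the array, which is the general source of the data reuse that systolic designs rely on.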

Full Text