Abstract

Convolution is the most important operation in convolutional neural networks (CNNs). FPGA-based CNN accelerators must carefully optimize the convolution loops to achieve high performance. This work analyzes convolution loop optimization in detail, exploiting loop tiling, loop unrolling, and loop interchange to design the accelerator's dataflow. It quantitatively evaluates strategies for data reuse and resource utilization, combining fixed and dynamic parallelism to design a high-performance adaptive accelerator. The proposed accelerator is evaluated on a ZCU102 FPGA by implementing a five-layer CNN whose convolution layers differ greatly in size. It achieves more than a 1.14x improvement in throughput efficiency over prior accelerators, while consuming less than half the logic resources of prior accelerators with a similar amount of computing resources.
