Abstract

Modern deep convolutional neural networks (CNNs) suffer from high computational complexity due to the large number of convolution operations they perform. Recently, fast convolution algorithms such as the fast Fourier transform (FFT) and the Winograd transform have gained attention as a way to address this problem. They reduce the number of multiplications required for convolution by replacing it with element-wise multiplication in the transform domain. However, fast convolution-based CNN accelerators face three major concerns: expensive domain transforms, large memory overhead, and limited flexibility in kernel size. In this paper, we present a novel CNN accelerator based on the number theoretic transform (NTT) that overcomes these limitations. We propose low-cost NTT and inverse-NTT converters that use only adders and shifters for on-chip domain transform, which solves the inflated-bandwidth problem and enables more parallel computation in the accelerator. We also propose an accelerator architecture comprising multiple tile engines with optimized data flow and mapping. Finally, we implement the proposed NTT-based CNN accelerator on the Xilinx Alveo U50 FPGA and evaluate it on popular deep CNN models. The proposed accelerator achieves 2859.5, 990.3, and 805.6 GOPS throughput for VGG-16, GoogLeNet, and Darknet-19, respectively, outperforming existing fast convolution-based CNN accelerators by up to <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$9.6\times $ </tex-math></inline-formula>.
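The transform-domain idea the abstract describes can be sketched with a toy NTT. This is a minimal illustration of the convolution theorem over a finite field, not the paper's design: the modulus 97, transform length 4, and root of unity 22 are hypothetical parameters chosen so a naive O(n²) transform stays exact, and element-wise multiplication of the transformed sequences yields their cyclic convolution.

```python
# Toy NTT demo (illustrative parameters only, not from the paper):
# pointwise multiplication in the NTT domain == cyclic convolution mod p.
p = 97   # small prime modulus, large enough that results do not wrap
n = 4    # transform length
w = 22   # primitive n-th root of unity mod p (22^2 ≡ -1 mod 97)

def ntt(x, root):
    # Naive O(n^2) forward transform: X[k] = sum_j x[j] * root^(j*k) mod p
    return [sum(xj * pow(root, j * k, p) for j, xj in enumerate(x)) % p
            for k in range(n)]

def intt(X):
    # Inverse transform: run ntt with root^{-1}, then scale by n^{-1} mod p
    inv_n = pow(n, -1, p)
    return [(v * inv_n) % p for v in ntt(X, pow(w, -1, p))]

a, b = [1, 2, 3, 4], [5, 6, 7, 8]
A, B = ntt(a, w), ntt(b, w)
C = [(x * y) % p for x, y in zip(A, B)]   # element-wise product
c = intt(C)                               # cyclic convolution of a and b

# Reference: direct cyclic convolution for comparison
ref = [sum(a[j] * b[(k - j) % n] for j in range(n)) % p for k in range(n)]
assert c == ref
print(c)  # [66, 68, 66, 60]
```

Because the NTT works over integers modulo a prime, the result is exact (no floating-point error, unlike FFT-based convolution), which is one motivation for NTT-based accelerators.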
