Accelerating Depthwise Convolution and Pooling Operations on z-First Storage CNN Architectures

Pramod Udupa, Gopinath Mahale, Kiran Kolar Chandrasekharan, Sehwan Lee

doi:10.1109/iscas45731.2020.9180863

Abstract

The majority of present-day computer vision applications use convolutional neural networks (CNNs) as the preferred engine of computation. For hardware acceleration of CNNs, z-first storage architectures such as NVDLA have gained prominence because they offer better resource utilization and effective throughput for 3-D convolution. However, these architectures have low effective throughput for 2-D operations such as Depthwise Convolution (DWC) and Pooling due to poor resource utilization. The current trend in CNNs shows that DWC has gained prominence in compact mobile-first CNNs such as MnasNet and EfficientNet, which have classification accuracies comparable to state-of-the-art complex CNNs. This work proposes an accelerator for 2-D operations on z-first storage architectures. The proposed accelerator improves power efficiency by at least 1.46× for DWC and 1.89× for Pooling over the baseline architecture, while improving effective throughput by 4×. The proposed input feature map (IFM) reuse scheme also reduces IFM reads from memory by a factor of at least 1.72×.
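To illustrate why a channel-first (z-first) datapath can be underused by 2-D operations, the following is a minimal sketch of a toy model, not the architecture described in the paper. It assumes a hypothetical MAC unit whose `LANES` inputs are fed with channel-adjacent values and reduced into a single accumulator; the lane count, function names, and utilization formula are all illustrative assumptions.

```python
import numpy as np

LANES = 8  # hypothetical number of MAC lanes; not a figure from the paper


def standard_conv_pixel(patch, kernel):
    """3-D convolution: one output value reduces over kh, kw and all C channels."""
    return float(np.sum(patch * kernel))


def depthwise_conv_pixel(patch, kernels):
    """Depthwise convolution: each channel is reduced over kh, kw only,
    giving C independent outputs with no reduction across channels."""
    return np.sum(patch * kernels, axis=(0, 1))


def lane_utilization(channels_per_output, lanes=LANES):
    """Toy model (assumption): fraction of lanes doing useful work when the
    unit sums `lanes` channel-adjacent products into one output."""
    return min(channels_per_output, lanes) / lanes


rng = np.random.default_rng(0)
patch = rng.standard_normal((3, 3, LANES))    # (kh, kw, C) input window
weights = rng.standard_normal((3, 3, LANES))  # (kh, kw, C) filter weights

# z-first layout: flattening (kh, kw, C) in C order makes the channel index
# vary fastest, so the first LANES values all belong to one spatial position.
z_first = patch.reshape(-1)

print("3-D conv lane utilization: ", lane_utilization(LANES))
print("depthwise lane utilization:", lane_utilization(1))
```

In this toy model a 3-D convolution keeps every lane busy because each output reduces across the channel axis, whereas a depthwise output draws on a single channel, so only one lane in `LANES` contributes. Pooling has the same per-channel structure.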
