Abstract

Most present-day computer vision applications use convolutional neural networks (CNNs) as their primary compute engine. For hardware acceleration of CNNs, z-first storage architectures such as NVDLA have gained prominence because they show better resource utilization and effective throughput for the 3-D convolution operation. However, these architectures have low effective throughput for 2-D operations such as depthwise convolution (DWC) and pooling, due to poor resource utilization. Current trends in CNNs show that DWC has gained prominence in compact, mobile-first CNNs such as MnasNet and EfficientNet, which achieve classification accuracies comparable to state-of-the-art complex CNNs. In this work, an accelerator for 2-D operations based on z-first storage architectures is proposed. The proposed accelerator delivers power efficiency improvements of at least 1.46× and 1.89× for DWC and pooling, respectively, over the baseline architecture, while improving effective throughput by 4×. The proposed input feature map (IFM) reuse scheme also reduces IFM reads from memory by a factor of at least 1.72×.
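
To make the contrast between 3-D and 2-D operations concrete, the following minimal Python sketch computes one output element of a standard convolution and of a depthwise convolution. The standard convolution reduces across all input channels (the z direction), while the depthwise convolution reduces only within a single channel. The function `channel_lane_utilization` and its `lanes` parameter are hypothetical: they form a toy model of a datapath that accumulates along the channel direction and are not taken from NVDLA or from the proposed accelerator.

```python
def conv3d_output(ifm, kernel, y, x):
    """One output element of a standard (3-D) convolution.

    ifm and kernel are nested lists indexed [channel][row][col].
    The sum runs over every channel and over the K x K window.
    """
    channels = len(ifm)
    k = len(kernel[0])
    return sum(
        ifm[c][y + i][x + j] * kernel[c][i][j]
        for c in range(channels)
        for i in range(k)
        for j in range(k)
    )


def depthwise_output(ifm, kernel, c, y, x):
    """One output element of a depthwise convolution for channel c.

    The sum runs over the K x K window of that single channel only.
    """
    k = len(kernel[c])
    return sum(
        ifm[c][y + i][x + j] * kernel[c][i][j]
        for i in range(k)
        for j in range(k)
    )


def channel_lane_utilization(channels_reduced, lanes):
    """Toy model (assumption): fraction of a `lanes`-wide channel-direction
    accumulator that does useful work when only `channels_reduced`
    channels contribute to one output element."""
    return min(channels_reduced, lanes) / lanes


# Small illustrative tensors: 2 channels, 3x3 IFM, 2x2 kernel, all ones.
ifm = [[[1] * 3 for _ in range(3)] for _ in range(2)]
kernel = [[[1] * 2 for _ in range(2)] for _ in range(2)]

full = conv3d_output(ifm, kernel, 0, 0)      # reduces over 2 * 2 * 2 values
dw = depthwise_output(ifm, kernel, 0, 0, 0)  # reduces over 2 * 2 values
```

Under this toy model, a depthwise output draws on one channel, so `channel_lane_utilization(1, lanes)` falls as `lanes` grows, whereas a standard convolution with at least `lanes` input channels keeps every lane busy. Real hardware mappings are more involved, so the model only illustrates the direction of the effect.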
