WinDConv: A Fused Datapath CNN Accelerator for Power-Efficient Edge Devices

Gopinath Mahale, Pramod Udupa, Kiran Kolar Chandrasekharan, Sehwan Lee

doi:10.1109/tcad.2020.3013096

Abstract

Diverse applications of deep convolutional neural networks (CNNs) in smart systems, such as image classification, semantic segmentation, and video recognition, require high-throughput acceleration for real-time performance. When such CNNs are realized on edge devices of the Internet of Things, a power/energy-efficient compute platform is required that can meet the limited power/energy budget of the devices. In this regard, this work proposes end-to-end power-optimized acceleration for compute-intensive CNNs. The proposed architecture, termed WinDConv, introduces a scheme to support both regular and energy-efficient Winograd convolutions on the same architecture through a fused datapath. Furthermore, by combining thoroughly investigated data sparsity enhancement, a data reuse scheme, and a memory hierarchy suited to power efficiency, the proposed architecture exhibits a practical average power efficiency of at least 12.35 tera operations per second per Watt, which is at least 2× higher than that of the generic z-first storage baseline architecture, with over 3× higher energy efficiency. The work also demonstrates that the proposed schemes apply to commonly occurring variants of the convolution operation.
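
To make the distinction between regular and Winograd convolution concrete, the following minimal sketch compares the two on the textbook 1-D case F(2,3), which produces 2 outputs from a 3-tap filter. The tile size, the function names `direct_conv_1d` and `winograd_f23`, and the floating-point arithmetic are illustrative assumptions; the abstract does not state which Winograd variant or number format WinDConv uses, and this is not a model of its fused datapath.

```python
def direct_conv_1d(d, g):
    """Regular (direct) computation: 2 outputs from 4 inputs and a 3-tap filter.

    Uses 6 multiplications.
    """
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Winograd minimal filtering F(2,3): same 2 outputs with 4 multiplications.

    Illustrative only; the tile size is an assumption, not taken from the paper.
    """
    # Filter transform: depends only on the weights, so it can be precomputed.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform followed by the 4 element-wise multiplications.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3

    # Output transform: additions and subtractions only.
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    weights = [0.5, -1.0, 2.0]
    print(direct_conv_1d(data, weights))
    print(winograd_f23(data, weights))
```

Both functions return the same pair of outputs; the Winograd form trades two multiplications for extra additions, which is the usual source of its energy advantage in hardware where multipliers dominate the cost.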

Full Text