Abstract

Deep convolutional neural networks (CNNs) are used in smart systems for image classification, semantic segmentation, video recognition, and similar tasks, and these applications require high-throughput acceleration to run in real time. When such CNNs are deployed on edge devices of the Internet of Things, the compute platform must also be power- and energy-efficient enough to fit within the devices' limited power and energy budgets. To this end, this work proposes an end-to-end power-optimized accelerator for compute-intensive CNNs. The proposed architecture, termed WinDConv, introduces a scheme that supports both regular convolution and energy-efficient Winograd convolution on the same hardware through a fused datapath. Combined with a thoroughly investigated data sparsity enhancement, a data reuse scheme, and a memory hierarchy chosen for power efficiency, the architecture achieves a practical average power efficiency of at least 12.35 tera operations per second per watt (TOPS/W), which is at least 2× higher than the generic z-first storage baseline architecture, with over 3× higher energy efficiency. The work also demonstrates that the proposed schemes apply to commonly occurring variants of the convolution operation.
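
To illustrate why Winograd convolution is described as energy-efficient, the following is a minimal sketch of the textbook 1-D Winograd F(2,3) transform, which produces two outputs of a 3-tap filter with 4 multiplications between input and filter terms instead of the 6 used by direct convolution. The tile size F(2,3) and the function names `direct_conv_1d` and `winograd_f23` are assumptions chosen for illustration; the abstract does not state which Winograd variant WinDConv implements, and this sketch does not model its fused datapath.

```python
def direct_conv_1d(d, g):
    """Direct (regular) convolution: 2 outputs from 4 inputs and a 3-tap filter.

    Uses 6 multiplications (3 per output).
    """
    y0 = d[0] * g[0] + d[1] * g[1] + d[2] * g[2]
    y1 = d[1] * g[0] + d[2] * g[1] + d[3] * g[2]
    return [y0, y1]


def winograd_f23(d, g):
    """Textbook Winograd F(2,3): same 2 outputs with 4 input-filter multiplications.

    The filter-side terms (g[0] + g[1] + g[2]) / 2 and (g[0] - g[1] + g[2]) / 2
    depend only on the weights, so they can be computed once ahead of time.
    """
    # Filter transform (can be precomputed per filter).
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform uses only additions and subtractions.
    v0 = d[0] - d[2]
    v1 = d[1] + d[2]
    v2 = d[2] - d[1]
    v3 = d[1] - d[3]

    # Element-wise products: the 4 multiplications.
    m1, m2, m3, m4 = v0 * u0, v1 * u1, v2 * u2, v3 * u3

    # Output transform uses only additions and subtractions.
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    d = [1.0, 2.0, 3.0, 4.0]
    g = [0.5, -1.0, 2.0]
    print(direct_conv_1d(d, g))
    print(winograd_f23(d, g))
```

Both functions return the same pair of outputs for the same input tile and filter; the difference is only in how many multiplications are spent, which is the property a hardware datapath can exploit.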
