DyNNamic: Dynamically Reshaping, High Data-Reuse Accelerator for Compact DNNs

Edward Hanson, Yiran Chen, Hai Helen Li, Shiyu Li, Xuehai Qian

doi:10.1109/tc.2022.3184272

Abstract

Convolutional layers dominate the computation and energy costs of Deep Neural Network (DNN) inference. Recent algorithmic works attempt to reduce these bottlenecks via compact DNN structures and model compression. Likewise, state-of-the-art accelerator designs leverage spatiotemporal characteristics of convolutional layers to reduce data movement overhead and improve throughput. Although both are independently effective at reducing latency and energy costs, combining these approaches does not guarantee cumulative improvements due to inefficient mapping. This inefficiency can be attributed to (1) inflexibility of underlying hardware and (2) inherent reduction of data-reuse opportunities of compact DNN structures. To address these issues, we propose a dynamically reshaping, high data-reuse PE array accelerator, namely DyNNamic. DyNNamic leverages kernel-wise filter decomposition to partition the convolution operation into two compact stages: Shared Kernels Convolution (SKC) and Weighted Accumulation (WA). Because both stages have vastly different dimensions, DyNNamic reshapes its PE array to effectively map the algorithm to the architecture. The architecture then exploits data-reuse opportunities created by the SKC stage, further reducing data movement with negligible overhead. We evaluate our approach on various representative networks and compare against state-of-the-art accelerators. On average, DyNNamic outperforms DianNao by $8.4\times$ and $12.3\times$ in terms of inference energy and latency, respectively.
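The abstract does not spell out the decomposition, so the following is only an illustrative sketch under one assumption: every 2D kernel of a filter is a linear combination of a small set of shared basis kernels. Under that assumption, linearity of convolution lets the layer be computed in two stages, a shared-kernel pass followed by a weighted sum. The names `basis`, `coeff`, `direct_conv`, and `two_stage_conv` are hypothetical and do not come from the paper.

```python
def conv2d(image, kernel):
    """Valid (no padding, stride 1) 2D cross-correlation on nested lists."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(k) for dj in range(k))
             for j in range(out_w)]
            for i in range(out_h)]


def add(a, b):
    """Element-wise sum of two equally sized 2D maps."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(a, s):
    """Multiply every element of a 2D map by the scalar s."""
    return [[s * p for p in row] for row in a]


def direct_conv(x, basis, coeff):
    """Reference path: rebuild each full kernel, then convolve per channel."""
    k = len(basis[0])
    outputs = []
    for per_channel in coeff:                  # one entry per output channel
        acc = None
        for c, mix in enumerate(per_channel):  # one entry per input channel
            # Assumed form: kernel = sum_m mix[m] * basis[m]
            kernel = [[sum(a * b[di][dj] for a, b in zip(mix, basis))
                       for dj in range(k)] for di in range(k)]
            fmap = conv2d(x[c], kernel)
            acc = fmap if acc is None else add(acc, fmap)
        outputs.append(acc)
    return outputs


def two_stage_conv(x, basis, coeff):
    """Two-stage path: shared-kernel convolution, then weighted accumulation."""
    # Stage 1 (SKC-like): each input channel meets each shared kernel once.
    skc = [[conv2d(channel, b) for b in basis] for channel in x]
    # Stage 2 (WA-like): every output channel reuses the same stage-1 maps.
    outputs = []
    for per_channel in coeff:
        acc = None
        for c, mix in enumerate(per_channel):
            for m, a in enumerate(mix):
                term = scale(skc[c][m], a)
                acc = term if acc is None else add(acc, term)
        outputs.append(acc)
    return outputs


# Toy data: 2 input channels (3x3), 2 shared 2x2 kernels, 2 output channels.
x = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
     [[0, 1, 0], [1, 0, 1], [0, 1, 0]]]
basis = [[[1, 0], [0, 1]],
         [[0, 1], [1, 0]]]
coeff = [[[1, 2], [3, 0]],      # coeff[out][in][m]
         [[0, 1], [-1, 2]]]

direct = direct_conv(x, basis, coeff)
staged = two_stage_conv(x, basis, coeff)
assert direct == staged  # both paths agree exactly on integer data
```

In this sketch the stage-1 maps are computed once per input channel and then read by every output channel, which is one plausible source of the data reuse the abstract mentions. The two stages also have very different shapes (sliding-window convolution versus element-wise weighted sums), which is consistent with the abstract's motivation for reshaping the PE array, although the actual hardware mapping is not described here.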

Journal: IEEE Transactions on Computers
Publication Date: Mar 1, 2023
Citations: 1

Similar Papers

Joint Optimization of DNN Partition and Continuous Task Scheduling for Digital Twin-Aided MEC Network With Deep Reinforcement Learning
Siyu Yuan ... Qin Li
IEEE Access | VOL. 11
01 Jan 2023

Deep Reinforcement Learning Based Resource Management for DNN Inference in Industrial IoT
Weiting Zhang ... Hongke Zhang
IEEE Transactions on Vehicular Technology | VOL. 70
24 Mar 2021

PMP: A partition-match parallel mechanism for DNN inference acceleration in cloud–edge collaborative environments
Zhuofan Liao ... Qiang Tang
Journal of Network and Computer Applications | VOL. 218
18 Aug 2023

Throughput Maximization of Delay-Aware DNN Inference in Edge Computing by Exploring DNN Model Partitioning and Inference Parallelism
Jing Li ... Weifa Liang
IEEE Transactions on Mobile Computing | VOL. 22
01 May 2023
