Abstract

Multiple deep neural networks (DNNs) are increasingly deployed together in real-world intelligent applications, such as intelligent robotics and autonomous vehicles, to collectively complete complicated tasks on edge devices. Because each layer of these subtasks prefers a distinct dataflow, owing to the heterogeneity in the shape and scale of network layers, DNN accelerators urgently require a variable-dataflow approach. On DNN accelerators that support multiple dataflows, however, we observe a dimension mismatch between the parallel processing pattern of each dataflow and the linear arrangement of data in memory. The issue is further exacerbated when multiple DNN tasks share partial features or weights. During processing, this mismatch causes a sluggish data supply from both off-chip and on-chip memory. Consequently, overall throughput, performance, and energy efficiency suffer, since DNN models are sensitive to data density. In this work, we reveal the mechanism behind this data dimension mismatch and present a series of metrics that quantify its influence on system performance. On this foundation, we propose a framework that tracks data tensor dimension conversion and employs flexible data arrangement across multi-DNN computation to adapt to dataflow variability. We also present an accelerator architecture, the data arrangement multi-DNN accelerator (DARMA), which features a data arrangement and distribution circuit and a hierarchical memory for data dimension conversion. With the mismatch mitigated, the proposed accelerator outperforms current accelerators in bandwidth and processing-unit utilization. Evaluation on VR/AR, MLPerf, and other multitask applications shows that the proposed architecture improves both energy efficiency and throughput.
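The dimension mismatch described above can be sketched with a minimal, hypothetical example (not taken from the paper): a feature map stored linearly in channel-last (HWC) order is accessed with a large stride when the dataflow parallelizes over the spatial plane of one channel, while rearranging the tensor to channel-first (CHW) order makes that same access contiguous, which is the kind of conversion a data arrangement unit would perform.

```python
import numpy as np

# Hypothetical illustration of the dataflow/layout mismatch: a feature
# map stored linearly in HWC order, while the dataflow consumes one
# whole channel plane at a time.

H, W, C = 4, 4, 8                             # small feature map
fmap = np.arange(H * W * C, dtype=np.int32)   # linear (row-major) memory

# View as H x W x C: the C values of one pixel are contiguous.
hwc = fmap.reshape(H, W, C)

# Fetching all H*W values of channel 0 in HWC layout is strided by
# C elements (C * 4 bytes), so memory bursts are poorly utilized.
channel0_strided = hwc[:, :, 0]
assert channel0_strided.strides[-1] == C * 4  # 32-byte stride per element

# Rearranging to CHW makes each channel plane contiguous, matching the
# parallel dimension of the dataflow.
chw = np.ascontiguousarray(hwc.transpose(2, 0, 1))
channel0_contig = chw[0]
assert channel0_contig.strides == (W * 4, 4)  # unit stride within rows

# Same values either way; only the memory arrangement differs.
assert np.array_equal(channel0_strided, channel0_contig)
```

The layout that is "right" depends on which tensor dimension the dataflow parallelizes, which is why a fixed arrangement cannot serve multiple dataflows efficiently.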
