DeepZoning: Re-accelerate CNN Inference with Zoning Graph for Heterogeneous Edge Cluster

Jingyu Wang, Ruilong Ma, Xiang Yang, Qi Qi, Zirui Zhuang, Jing Wang, Jianxin Liao, Song Guo

doi:10.1145/3701995

Open Access

https://doi.org/10.1145/3701995

Abstract

Parallelizing CNN inference on heterogeneous edge clusters with data parallelism has gained popularity as a way to meet real-time requirements without sacrificing model accuracy. However, existing algorithms struggle to find the optimal parallel granularity for complex CNNs, whose structure is a directed acyclic graph (DAG) rather than a chain, and their parallel dimension is inflexible. Distributing the workload of modern CNNs across heterogeneous devices has also been proven to be an NP-hard problem. In this paper, we introduce DeepZoning, a versatile and cooperative inference framework that combines both model and data parallelism to accelerate CNN inference. DeepZoning employs two algorithms at different levels: (1) a low-level Adaptive Workload Partition algorithm that uses linear programming and optimizes over both the spatial and channel dimensions when searching for a feature map distribution on heterogeneous devices, and (2) a high-level Model Partition algorithm that finds the optimal model granularity and organizes complex CNNs into sequential zones to balance communication and computation during execution. Our experimental evaluations show that DeepZoning is effective, achieving up to a 3.02× speedup on our experimental prototype compared to state-of-the-art algorithms.
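The abstract does not give the linear program used by the Adaptive Workload Partition algorithm, so the sketch below is not DeepZoning's method. It illustrates only the underlying idea of splitting one feature map across devices of unequal speed, under strong simplifying assumptions: the split is along a single spatial dimension (rows), communication cost is ignored, and each device's compute time is taken to be proportional to the rows it receives. Under those assumptions the makespan-minimizing fractional split is simply proportional to device speed, and the code rounds it to whole rows. The names `partition_rows`, `makespan`, and the `speeds` values are hypothetical.

```python
from fractions import Fraction


def partition_rows(total_rows, speeds):
    """Split `total_rows` feature-map rows across devices in proportion to speed.

    Assumption (not from the paper): compute time on device i is rows_i / speeds[i],
    and communication cost is ignored.
    """
    if total_rows < 0 or not speeds or any(s <= 0 for s in speeds):
        raise ValueError("need total_rows >= 0 and strictly positive speeds")

    total_speed = sum(Fraction(s) for s in speeds)
    # Ideal (fractional) share of rows for each device.
    ideal = [Fraction(s) * total_rows / total_speed for s in speeds]
    # Start from the floor of each ideal share.
    rows = [int(x) for x in ideal]
    leftover = total_rows - sum(rows)
    # Give the remaining rows to the devices with the largest fractional remainder.
    order = sorted(range(len(speeds)), key=lambda i: ideal[i] - rows[i], reverse=True)
    for i in order[:leftover]:
        rows[i] += 1
    return rows


def makespan(rows, speeds):
    """Completion time of the slowest device under the same simplified cost model."""
    return max(r / s for r, s in zip(rows, speeds))


if __name__ == "__main__":
    speeds = [4, 2, 1, 1]  # hypothetical relative device throughputs
    rows = partition_rows(224, speeds)
    print(rows, makespan(rows, speeds))
```

A formulation closer to the one described in the abstract would also account for the channel dimension and for data transfer between devices, which is where a linear-programming solver becomes necessary.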

More From: ACM Transactions on Architecture and Code Optimization
