Adaptive partitioning and efficient scheduling for distributed DNN training in heterogeneous IoT environment

Binbin Huang, Xunqing Huang, Xiao Liu, Chuntao Ding, Yuyu Yin, Shuiguang Deng

doi:10.1016/j.comcom.2023.12.034

Abstract

With the proliferation of Internet-of-Things (IoT) devices, there is a growing trend toward training a deep neural network (DNN) model with pipeline parallelism across resource-constrained IoT devices. To ensure model convergence and accuracy, synchronous pipeline parallelism is usually adopted. However, a synchronous pipeline can incur long waiting times because gradients from all microbatches must be aggregated before the model is updated. An adaptive partitioning and efficient scheduling scheme for DNN models in heterogeneous IoT environments is therefore urgently needed. To address this problem, we propose a policy gradient based model partitioning and scheduling scheme (PG-MPSS) that minimizes per-iteration training time. Specifically, we first design a double-network framework to partition and schedule a DNN model. We then adopt a policy gradient algorithm to update the parameters of the double network, with the aim of learning an optimal double-network model. We conduct extensive experiments comparing the DNN training time of PG-MPSS with that of five baseline algorithms under different experimental settings: Dynamic Programming (DP), Genetic Algorithm (GA), Particle Swarm Optimization (PSO), Average&Greedy (AG), and Proximal Policy Optimization (PPO). The experimental results demonstrate that PG-MPSS can greatly expedite synchronous pipeline training of a DNN model.
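
To make the optimization target concrete, the following is a minimal sketch of how per-iteration time in a synchronous pipeline depends on where a model is cut across devices of different speeds. It is not the PG-MPSS algorithm: it uses exhaustive search in place of a learned policy, models only the forward pass, and ignores communication cost. All names (`stage_times`, `pipeline_time`, `best_partition`) and the layer costs and device speeds are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def stage_times(layer_cost, cuts, device_speed):
    """Per-microbatch compute time of each stage for a contiguous partition.

    `cuts` lists the layer indices where a new stage begins; stage i runs
    on device i. The cost model (work / speed) is an assumption.
    """
    bounds = [0, *cuts, len(layer_cost)]
    return [
        sum(layer_cost[bounds[i]:bounds[i + 1]]) / device_speed[i]
        for i in range(len(device_speed))
    ]


def pipeline_time(times, num_microbatches):
    """Makespan of a forward-only synchronous pipeline.

    A stage starts a microbatch once it is free and the upstream stage
    has finished that same microbatch.
    """
    finish = [0.0] * len(times)
    for _ in range(num_microbatches):
        upstream_done = 0.0
        for s, t in enumerate(times):
            start = max(finish[s], upstream_done)
            finish[s] = start + t
            upstream_done = finish[s]
    return finish[-1]


def best_partition(layer_cost, device_speed, num_microbatches):
    """Try every contiguous partition and return (makespan, cuts) of the best."""
    best = None
    for cuts in combinations(range(1, len(layer_cost)), len(device_speed) - 1):
        t = pipeline_time(
            stage_times(layer_cost, cuts, device_speed), num_microbatches
        )
        if best is None or t < best[0]:
            best = (t, cuts)
    return best


# Hypothetical setup: four layers, two devices, the second twice as fast.
layer_cost = [4, 2, 2, 4]
device_speed = [1.0, 2.0]
even_split = pipeline_time(stage_times(layer_cost, (2,), device_speed), 4)
best_time, best_cuts = best_partition(layer_cost, device_speed, 4)
```

In this toy setting the even two-layer split leaves the slower device as the bottleneck, while the search shifts more layers to the faster device. Exhaustive search grows combinatorially with the number of layers and devices, which is one reason a learned or heuristic scheme is attractive for larger models.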

More From: Computer Communications

Similar Papers

Split computing: DNN inference partition with load balancing in IoT-edge platform for beyond 5G
Jyotirmoy Karjee ... Vanamala N Bhargav
Measurement: Sensors | VOL. 23
18 Aug 2022

Robustness analysis and experimental validation of a deep neural network for acoustic source imaging
Qing Li ... Yu Liu
Mechanical Systems and Signal Processing | VOL. 216
04 May 2024

A comparative evaluation of deep convolutional neural network and deep neural network-based land use/land cover classifications of mining regions using fused multi-sensor satellite data
Ajay Kumar ... Amit Kumar Gorai
Advances in Space Research | VOL. 72
04 Sep 2023

An Empirical Study of the Impact of Hyperparameter Tuning and Model Optimization on the Performance Properties of Deep Neural Networks
Lizhi Liao ... Weiyi Shang
ACM Transactions on Software Engineering and Methodology | VOL. 31
09 Apr 2022
