Abstract
With the proliferation of Internet-of-Things (IoT) devices, there is a growing trend toward training a deep neural network (DNN) model with pipeline parallelism across resource-constrained IoT devices. To ensure model convergence and accuracy, synchronous pipeline parallelism is usually adopted. However, a synchronous pipeline can incur long waiting times because it aggregates the gradients of all microbatches. An adaptive partitioning and efficient scheduling scheme for DNN models in heterogeneous IoT environments is therefore urgently needed. To address this problem, we propose a policy-gradient-based model partitioning and scheduling scheme (PG-MPSS) to minimize the per-iteration training time. More specifically, we first design a double-network framework to partition and schedule a DNN model. Then, we adopt a policy gradient algorithm to update the double-network parameters, aiming to learn an optimal double-network model. We conduct extensive experiments comparing the DNN training time of PG-MPSS with that of five baseline algorithms, namely Dynamic Programming (DP), Genetic Algorithm (GA), Particle Swarm Optimization (PSO), Average&Greedy (AG), and Proximal Policy Optimization (PPO), under different experimental settings. The experimental results demonstrate that PG-MPSS can greatly expedite synchronous pipeline training of a DNN model.
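To make the optimization target concrete, the following is a minimal sketch of how a per-iteration time could be estimated for one candidate partition of a model across heterogeneous devices. It is not the PG-MPSS scheme: it uses a toy cost model and exhaustive search, and every name in it (`stage_times`, `iteration_time`, `best_partition`, the layer costs, and the device speeds) is a hypothetical introduced for illustration. It assumes contiguous layer-to-device partitions, folds forward and backward work into a single per-microbatch stage time, and ignores communication cost.

```python
from itertools import combinations


def stage_times(layer_costs, device_speeds, cuts):
    """Per-microbatch compute time of each stage for a contiguous partition.

    `cuts` holds the layer indices at which a new stage begins, so
    len(cuts) + 1 must equal len(device_speeds). (Hypothetical cost model.)
    """
    bounds = [0, *cuts, len(layer_costs)]
    return [
        sum(layer_costs[start:end]) / speed
        for start, end, speed in zip(bounds, bounds[1:], device_speeds)
    ]


def iteration_time(times, num_microbatches):
    """Time until the last microbatch leaves the last stage.

    A stage starts a microbatch only after it has finished the previous
    microbatch and the preceding stage has finished the current one.
    """
    finish = [0.0] * len(times)  # finish time of the latest microbatch per stage
    for _ in range(num_microbatches):
        prev_stage_done = 0.0
        for s, t in enumerate(times):
            finish[s] = max(finish[s], prev_stage_done) + t
            prev_stage_done = finish[s]
    return finish[-1]


def best_partition(layer_costs, device_speeds, num_microbatches):
    """Exhaustively search all contiguous partitions (toy sizes only)."""
    num_layers, num_devices = len(layer_costs), len(device_speeds)
    return min(
        combinations(range(1, num_layers), num_devices - 1),
        key=lambda cuts: iteration_time(
            stage_times(layer_costs, device_speeds, cuts), num_microbatches
        ),
    )


if __name__ == "__main__":
    costs = [4, 4, 4, 4]   # assumed relative compute cost per layer
    speeds = [1, 3]        # assumed relative device speeds
    cuts = best_partition(costs, speeds, num_microbatches=4)
    print(cuts, iteration_time(stage_times(costs, speeds, cuts), 4))
```

Under these assumptions, the slower device is best given a single layer, so that both stages take the same time per microbatch and neither waits on the other. Exhaustive search scales combinatorially with the number of layers and devices, which is one reason learned or heuristic search methods are considered for this problem.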