Abstract

Deep Neural Networks (DNNs) have recently achieved great success in medical and other complex classification tasks. However, as the size of a DNN model and the available dataset grow, training becomes more complex and computationally intensive, and therefore takes longer to complete. In this work, we propose a generic, full end-to-end hybrid parallelization approach that combines model and data parallelism for efficient, distributed, and scalable training of DNN models. We also propose a Genetic Algorithm-based heuristic resource allocation mechanism (GABRA) that optimally distributes model partitions across the available GPUs to optimize computing performance. We apply the proposed approach to a real use case based on a 3D Residual Attention Deep Neural Network (3D-ResAttNet) for efficient Alzheimer's Disease (AD) diagnosis on multiple GPUs. The experimental evaluation shows that the proposed approach is efficient and scalable, achieving almost linear speedup with little or no difference in accuracy compared with the existing non-parallel DNN models.
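The abstract describes GABRA only at a high level. The sketch below illustrates, in broad strokes, how a genetic algorithm can assign model partitions to GPUs by minimising the load on the most heavily loaded device. It is not the GABRA implementation from this work; the partition costs, GPU speeds, and GA parameters are hypothetical values chosen purely for illustration.

    # Illustrative genetic-algorithm allocation of model partitions to GPUs.
    # NOT the paper's GABRA: costs, speeds and GA parameters are assumptions.
    import random

    PARTITION_COST = [4.0, 2.5, 3.0, 1.5, 2.0, 3.5]   # relative compute cost per partition
    GPU_SPEED = [1.0, 1.0, 0.5]                        # relative throughput of each GPU

    def fitness(assignment):
        """Negative makespan: the most loaded GPU determines the step time."""
        load = [0.0] * len(GPU_SPEED)
        for part, gpu in enumerate(assignment):
            load[gpu] += PARTITION_COST[part] / GPU_SPEED[gpu]
        return -max(load)

    def crossover(a, b):
        point = random.randrange(1, len(a))            # single-point crossover
        return a[:point] + b[point:]

    def mutate(assignment, rate=0.1):
        return [random.randrange(len(GPU_SPEED)) if random.random() < rate else g
                for g in assignment]

    def evolve(pop_size=50, generations=200):
        population = [[random.randrange(len(GPU_SPEED)) for _ in PARTITION_COST]
                      for _ in range(pop_size)]
        for _ in range(generations):
            population.sort(key=fitness, reverse=True)
            parents = population[: pop_size // 2]       # keep the fittest half
            children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                        for _ in range(pop_size - len(parents))]
            population = parents + children
        return max(population, key=fitness)

    if __name__ == "__main__":
        best = evolve()
        print("partition -> GPU:", best, "max GPU load:", -fitness(best))

In practice, the fitness function would also account for communication volume between partitions placed on different GPUs, which this toy objective ignores.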

Highlights

  • In recent times, Deep Neural Networks (DNNs) have gained popularity as an important tool for solving complex tasks ranging from image classification [1], speech recognition [2], and medical diagnosis [3, 4] to recommendation systems [5] and complex games [6, 7]

  • The aforementioned approaches adopt data, model, and pipeline parallelization separately or in combination to improve the performance of DNN model training

  • We conducted the experiments on the 3D-ResAttNet model for two classification tasks: stable mild cognitive impairment (sMCI) vs. progressive mild cognitive impairment (pMCI), and Alzheimer's Disease (AD) vs. Normal Cohort (NC)



Introduction

Deep Neural Networks (DNNs) have gained popularity as an important tool for solving complex tasks ranging from image classification [1], speech recognition [2], and medical diagnosis [3, 4] to recommendation systems [5] and complex games [6, 7]. Training a DNN model requires a large volume of data and is both data- and compute-intensive, leading to long training times. To overcome this challenge, various parallel and distributed computing methods [8] have been proposed to scale up DNN models and provide timely and efficient learning solutions. These methods can be broadly divided into data parallelism, model parallelism, pipeline parallelism, and hybrid parallelism (a combination of data and model parallelism).
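To make the distinction concrete, the following is a minimal, hypothetical PyTorch sketch of hybrid parallelism: each replica splits a toy two-stage model across two local GPUs (model parallelism), while DistributedDataParallel synchronises gradients across replicas (data parallelism). The model, device mapping, and training loop are illustrative assumptions only and do not reproduce the 3D-ResAttNet setup used in this work.

    # Minimal sketch of hybrid (model + data) parallelism in PyTorch.
    # Assumes a single node launched with torchrun, one process per model replica,
    # and two GPUs per replica; the toy model and data are placeholders.
    import torch
    import torch.nn as nn
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    class TwoStageNet(nn.Module):
        """Model parallelism: the first stage lives on one GPU, the second on another."""
        def __init__(self, dev0, dev1):
            super().__init__()
            self.dev0, self.dev1 = dev0, dev1
            self.stage1 = nn.Sequential(nn.Linear(1024, 512), nn.ReLU()).to(dev0)
            self.stage2 = nn.Linear(512, 2).to(dev1)

        def forward(self, x):
            x = self.stage1(x.to(self.dev0))
            return self.stage2(x.to(self.dev1))    # activations cross GPUs here

    def main():
        dist.init_process_group("nccl")            # one process per replica
        rank = dist.get_rank()
        dev0, dev1 = f"cuda:{2 * rank}", f"cuda:{2 * rank + 1}"
        model = TwoStageNet(dev0, dev1)
        # Data parallelism: DDP all-reduces gradients across replicas each step.
        model = DDP(model)                         # no device_ids for multi-device models
        opt = torch.optim.SGD(model.parameters(), lr=0.01)
        for _ in range(10):                        # toy training loop with random data
            x = torch.randn(32, 1024)
            y = torch.randint(0, 2, (32,), device=dev1)
            loss = nn.functional.cross_entropy(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()

    if __name__ == "__main__":
        main()

Pipeline parallelism would additionally split each mini-batch into micro-batches so the two stages can work concurrently rather than idling while activations move between GPUs.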
