Abstract

The accelerated development of applications related to artificial intelligence has driven the creation of increasingly complex neural network models with enormous numbers of parameters, currently reaching into the trillions. Training such models is practically impossible without parallelization. Parallelism, applied through different approaches, is the mechanism used to make training at this scale feasible. This paper presents an overview of the state of the art in parallelism for deep learning training from multiple points of view. The study addresses pipeline parallelism, hybrid parallelism, mixture-of-experts, and auto-parallelism, topics that currently play a leading role in scientific research in this area. Finally, we carry out a series of experiments with data parallelism and model parallelism, so that the reader can observe the performance of these two types of parallelism and understand the approach of each more clearly.
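To make the contrast between the two experimental approaches concrete, the following is a minimal PyTorch sketch, not taken from the paper: model parallelism splits the layers of one network across devices, while data parallelism replicates the network and splits the batch, averaging gradients across replicas. Device names, layer sizes, and the helper `data_parallel_grads` are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Two devices; fall back to CPU so the sketch runs without GPUs.
n_gpus = torch.cuda.device_count()
dev0 = torch.device("cuda:0") if n_gpus >= 2 else torch.device("cpu")
dev1 = torch.device("cuda:1") if n_gpus >= 2 else torch.device("cpu")


# Model parallelism: the layers of a single model are placed on different
# devices, so activations travel between devices inside one forward pass.
class ModelParallelNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Linear(32, 64).to(dev0)
        self.stage2 = nn.Linear(64, 10).to(dev1)

    def forward(self, x):
        h = torch.relu(self.stage1(x.to(dev0)))
        return self.stage2(h.to(dev1))


# Data parallelism (toy version): the model is replicated, each replica sees
# a different shard of the batch, and gradients are averaged across replicas
# (the role an all-reduce plays in a real distributed setup).
def data_parallel_grads(make_model, batch, targets, loss_fn):
    replicas = [make_model(), make_model()]
    replicas[1].load_state_dict(replicas[0].state_dict())  # identical weights
    x_shards, y_shards = torch.chunk(batch, 2), torch.chunk(targets, 2)
    for rep, xs, ys in zip(replicas, x_shards, y_shards):
        loss_fn(rep(xs), ys).backward()
    # Average per-parameter gradients across the two replicas.
    return [
        (p0.grad + p1.grad) / 2
        for p0, p1 in zip(replicas[0].parameters(), replicas[1].parameters())
    ]


if __name__ == "__main__":
    x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
    mp_out = ModelParallelNet()(x)  # forward pass spanning two devices
    grads = data_parallel_grads(lambda: nn.Linear(32, 10),
                                x, y, nn.CrossEntropyLoss())
    print(mp_out.shape, [g.shape for g in grads])
```

In practice, frameworks such as `torch.nn.parallel.DistributedDataParallel` automate the replication and gradient averaging shown above, but the sketch makes the difference in how work is divided between the two approaches explicit.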
