Abstract

Graphics processing units (GPUs) are widely adopted to accelerate the training of deep neural networks (DNNs). However, limited GPU memory restricts the maximum scale of the DNNs that can be trained on a GPU, which presents a serious challenge. This paper proposes moDNN, a framework for optimizing memory usage in DNN training. moDNN supports automatic tuning of DNN training code to match any given memory budget (no smaller than the theoretical lower bound). By taking full advantage of overlapping computation with data transfers, we develop heuristics that judiciously schedule data offloading and prefetching, together with training algorithm selection, to optimize memory usage. We further introduce a new sub-batch size selection method that also greatly reduces memory usage. moDNN can reduce memory usage by up to 50× compared with the ideal case, which assumes that GPU memory is large enough to hold all data. When executing moDNN on a GPU with 12 GB of memory, the performance loss is only 8%, much lower than that incurred by the best known existing approach, vDNN. moDNN is also applicable to multiple GPUs and attains a 1.84× average speedup on two GPUs.
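
The abstract names two general memory-saving ideas: offloading activations to host memory during the forward pass and prefetching them back for the backward pass, overlapped with computation, plus splitting each mini-batch into smaller sub-batches. The sketch below is only a minimal PyTorch illustration of those two ideas under assumed, hand-picked sizes; it is not moDNN's implementation, which tunes the offload/prefetch schedule and sub-batch size automatically to a given memory budget.

```python
# Minimal sketch (not moDNN): activation offloading to host memory plus
# sub-batch training with gradient accumulation. All sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 10),
).cuda()
opt = torch.optim.SGD(model.parameters(), lr=0.01)

batch, sub_batch = 256, 64          # sub_batch assumed chosen to fit the memory budget
x = torch.randn(batch, 4096, device="cuda")
y = torch.randint(0, 10, (batch,), device="cuda")

opt.zero_grad()
for i in range(0, batch, sub_batch):
    xb, yb = x[i:i + sub_batch], y[i:i + sub_batch]
    # save_on_cpu stashes the activations needed for backward in pinned host
    # memory during the forward pass and copies them back when backward
    # needs them, trading PCIe transfers for GPU memory.
    with torch.autograd.graph.save_on_cpu(pin_memory=True):
        loss = F.cross_entropy(model(xb), yb) * (sub_batch / batch)
    loss.backward()                 # gradients accumulate across sub-batches
opt.step()
```

Scaling each sub-batch loss by sub_batch/batch makes the accumulated gradient equal to the gradient of the full mini-batch, so the sub-batch split changes memory usage but not the training result.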
