Efficient Multi-GPU Memory Management for Deep Learning Acceleration

Youngrang Kim, Jaehwan Lee, Hyunseung Jei, Hongchan Roh, Jik-Soo Kim

doi:10.1109/fas-w.2018.00023

Abstract

In this paper, we propose an optimized memory management scheme that improves overall GPU memory utilization in multi-GPU systems for deep learning acceleration. We extend NVIDIA's vDNN concept (a hybrid utilization of GPU and CPU memories) to a multi-GPU environment by effectively addressing PCIe-bus contention problems. In addition, we design and implement an intelligent prefetching algorithm (from CPU memory to GPU memory) that achieves the highest processing throughput while sustaining a large mini-batch size. For evaluation, we have implemented our memory usage optimization scheme on TensorFlow, the well-known machine learning library from Google, and performed extensive experiments on a multi-GPU testbed. Our evaluation results show that the proposed scheme can increase the mini-batch size by up to 60% and improve the training throughput by up to 46.6% in a multi-GPU system.
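To illustrate why a hybrid use of GPU and CPU memory can raise the feasible mini-batch size, the following is a minimal sketch under stated assumptions, not the scheme proposed in the paper. It models a toy feed-forward network in which each layer produces one activation tensor. Without offloading, all activations stay in GPU memory until the backward pass; with offloading, only the input and output of the layer currently being computed are assumed to be resident. The function names, the per-sample activation sizes, and the 8000 MB budget are all hypothetical, and the model ignores weights, workspace memory, and PCIe transfer time.

```python
def peak_memory_mb(activations_mb, offload):
    """Peak per-sample GPU memory (MB) for the forward pass of a toy network.

    activations_mb: hypothetical per-sample activation size of each layer.
    offload: if True, assume activations are moved to CPU memory as soon as
             the next layer no longer needs them (a vDNN-style assumption).
    """
    if not offload:
        # Every activation stays on the GPU until the backward pass.
        return sum(activations_mb)
    peak = 0.0
    for i, out_mb in enumerate(activations_mb):
        # Only the current layer's input and output are resident together.
        in_mb = activations_mb[i - 1] if i > 0 else 0.0
        peak = max(peak, in_mb + out_mb)
    return peak


def max_batch_size(gpu_budget_mb, per_sample_mb):
    """Largest mini-batch whose activations fit in the given GPU budget."""
    return int(gpu_budget_mb // per_sample_mb)


# Hypothetical per-sample activation sizes (MB) and GPU budget (MB).
acts = [4.0, 8.0, 8.0, 4.0, 2.0]
budget = 8000.0

baseline = max_batch_size(budget, peak_memory_mb(acts, offload=False))
hybrid = max_batch_size(budget, peak_memory_mb(acts, offload=True))
print(baseline, hybrid)
```

In this toy setting the offloaded configuration fits a larger mini-batch because the peak footprint is set by the largest adjacent pair of layers rather than by the sum over all layers. The numbers are illustrative only and are unrelated to the 60% figure reported above; in practice the gain also depends on whether prefetching can hide the CPU-to-GPU transfer latency.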

Similar Papers

Comprehensive techniques of multi-GPU memory optimization for deep learning acceleration
Youngrang Kim ... Hongchan Roh
Cluster Computing | VOL. 23
23 Aug 2019

GPUdmm: A high-performance and memory-oblivious GPU architecture using dynamic memory management
Youngsok Kim ... Jaewon Lee
01 Feb 2014

Scaling up MapReduce-based Big Data Processing on Multi-GPU systems
Hai Jiang ... Zhi Qiao
Cluster Computing | VOL. 18
22 Aug 2014

PommDNN: Performance optimal GPU memory management for deep neural network training
Weiduo Chen ... Qiang Wang
Future Generation Computer Systems | VOL. 152
01 Nov 2023
