Abstract

Owing to their low overhead and rapid deployment, containers are becoming an increasingly attractive system software platform for deep learning and high performance computing (HPC) applications that leverage GPUs. Unfortunately, existing container software does not control how each container allocates GPU memory. As a result, if one container consumes the majority of GPU memory, other containers may be unable to run their workloads due to insufficient memory. This paper presents gShare, a centralized GPU memory management framework that enables GPU memory sharing among containers. Like a modern operating system, gShare allocates the entire GPU memory inside the framework and manages it with sophisticated memory allocators. gShare can then enforce each container's GPU memory limit by mediating memory allocation calls. To achieve this, gShare introduces API remoting components, a mediator, and a three-level memory allocator, which together enable lightweight and efficient GPU memory management. Our prototype implementation achieves near-native performance with secure isolation and little memory waste on popular deep learning and HPC workloads.
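The core mediation idea from the abstract, enforcing a per-container limit over a centrally owned GPU memory pool, can be sketched as follows. This is a minimal illustrative model only; the class and method names (`MemoryMediator`, `register`, `alloc`) are assumptions, not gShare's actual interface, and real GPU allocation calls (e.g. CUDA driver API) are simulated with plain accounting.

```python
# Illustrative sketch (not gShare's real API): a central mediator owns a
# fixed "GPU" memory pool and checks every allocation call against the
# calling container's limit before granting it.

class InsufficientMemory(Exception):
    pass

class MemoryMediator:
    def __init__(self, total_bytes):
        self.free = total_bytes   # remaining bytes in the shared pool
        self.limits = {}          # container id -> per-container byte limit
        self.usage = {}           # container id -> bytes currently allocated

    def register(self, cid, limit_bytes):
        self.limits[cid] = limit_bytes
        self.usage[cid] = 0

    def alloc(self, cid, nbytes):
        # Reject the call if it would exceed the container's limit
        # or exhaust the shared pool; otherwise account for it.
        if self.usage[cid] + nbytes > self.limits[cid]:
            raise InsufficientMemory(f"container {cid} over its limit")
        if nbytes > self.free:
            raise InsufficientMemory("GPU pool exhausted")
        self.usage[cid] += nbytes
        self.free -= nbytes

    def release(self, cid, nbytes):
        self.usage[cid] -= nbytes
        self.free += nbytes
```

For example, with a 4 GiB pool split into 2 GiB limits per container, an allocation that would push a container past its 2 GiB quota is rejected even if the pool still has free memory, which is exactly the isolation property the paper targets.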
