Abstract

Graphics processing units (GPUs) provide massively parallel computational power and encourage general-purpose computing on GPUs (GPGPU). The distinctive design of discrete GPUs gives them the high throughput, scalability, and energy efficiency needed by GPGPU applications. Despite previous studies on GPU virtualization, the tradeoffs among virtualization approaches remain unclear because of the lack of designs for, and quantitative evaluations of, hypervisor-level virtualization for discrete GPUs. Shedding light on these tradeoffs and on the technical requirements of hypervisor-level virtualization would facilitate the development of appropriate GPU virtualization solutions. This paper presents GPUvm, an open architecture for hypervisor-level GPU virtualization with a particular emphasis on the Xen hypervisor. GPUvm offers three virtualization modes: full virtualization, naive para-virtualization, and high-performance para-virtualization. GPUvm exposes low- and high-level interfaces, such as memory-mapped I/O and the DRM APIs, to guest virtual machines (VMs). Our experiments using a relevant commodity GPU showed that GPUvm incurs different overheads as the level of the exposed interfaces is changed. The results also showed that coarse-grained fairness on the GPU among multiple VMs can be achieved using GPU scheduling.
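The full-virtualization mode described above rests on a trap-and-emulate idea: every guest MMIO access to the virtual GPU is intercepted by the hypervisor and applied to per-VM shadow state rather than to the physical device. The sketch below is a minimal, self-contained illustration of that idea in C; all names (shadow_gpu, mmio_write, mmio_read) are hypothetical and do not come from the GPUvm codebase.

```c
/* A minimal sketch of trap-and-emulate MMIO virtualization, assuming a
 * hypothetical register-file layout. Not GPUvm's actual implementation. */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define MMIO_REGS 256

/* Per-VM shadow copy of the GPU's register file. */
struct shadow_gpu {
    int      vm_id;
    uint32_t regs[MMIO_REGS];
};

/* Invoked when the hypervisor traps a guest store to the GPU MMIO range.
 * The access is emulated against the VM's shadow state instead of the
 * physical device; validated state would later be flushed to the real GPU
 * on behalf of whichever VM the GPU scheduler has selected. */
static void mmio_write(struct shadow_gpu *g, uint32_t reg, uint32_t val)
{
    if (reg >= MMIO_REGS) {
        fprintf(stderr, "vm%d: rejected out-of-range MMIO write\n", g->vm_id);
        return;
    }
    g->regs[reg] = val;
    printf("vm%d: trapped MMIO write reg=%" PRIu32 " val=0x%" PRIx32 "\n",
           g->vm_id, reg, val);
}

/* Trapped guest load: served entirely from the shadow register file. */
static uint32_t mmio_read(const struct shadow_gpu *g, uint32_t reg)
{
    return (reg < MMIO_REGS) ? g->regs[reg] : 0;
}

int main(void)
{
    /* Two guest VMs sharing one GPU, each with its own shadow registers. */
    struct shadow_gpu vm0 = { .vm_id = 0 }, vm1 = { .vm_id = 1 };

    mmio_write(&vm0, 4, 0xdeadbeef);
    mmio_write(&vm1, 4, 0xcafef00d);

    /* Isolation: each VM reads back only its own value. */
    printf("vm0 reg4=0x%" PRIx32 ", vm1 reg4=0x%" PRIx32 "\n",
           mmio_read(&vm0, 4), mmio_read(&vm1, 4));
    return 0;
}
```

Trapping every access in this way is what makes full virtualization transparent to unmodified guest drivers, and also why it is the costliest of the three modes; the para-virtualized modes reduce this overhead by letting the guest issue higher-level requests instead.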
