Simultaneous Multikernel: Fine-Grained Sharing of GPUs

Zhenning Wang, Jun Yang, Bruce Childers, Youtao Zhang, Minyi Guo, Rami Melhem

doi:10.1109/lca.2015.2477405

Open Access

Abstract

Studies show that non-graphics programs can be less optimized for the GPU hardware, leading to significant resource under-utilization. Sharing the GPU among multiple programs can effectively improve utilization, which is particularly attractive to systems (e.g., cloud computing) where many applications require access to the GPU. However, current GPUs lack the architectural features needed to support sharing. Initial attempts are preliminary: they either provide only static sharing, which requires recompilation or code transformation, or do not effectively improve GPU resource utilization. We propose Simultaneous Multikernel (SMK), a fine-grained dynamic sharing mechanism that fully utilizes resources within a streaming multiprocessor by exploiting the heterogeneity of different kernels. We extend the GPU hardware to support SMK and propose several resource allocation strategies to improve system throughput while maintaining fairness. Our evaluation of 45 shared workloads shows that SMK improves GPU throughput by 34 percent over non-shared execution and 10 percent over a state-of-the-art design.
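The intuition behind exploiting kernel heterogeneity within a streaming multiprocessor (SM) can be illustrated with a small sketch. It is not the allocation strategy proposed in the paper: it is a simple greedy fill, and the SM capacities, the two kernels, and their per-block demands are all hypothetical values chosen for illustration. The idea is only that a kernel limited by one resource leaves other resources idle, and a second kernel with a different bottleneck may be able to use them.

```python
# Hypothetical per-SM capacities and per-block demands, for illustration only.
SM_CAPACITY = {"registers": 65536, "shared_mem": 49152, "threads": 2048}
KERNEL_A = {"registers": 8192, "shared_mem": 16384, "threads": 256}  # shared-memory bound
KERNEL_B = {"registers": 8192, "shared_mem": 0, "threads": 256}      # uses no shared memory


def max_blocks(demand, free):
    """Largest number of thread blocks with per-block `demand` that fit in `free`."""
    return min(free[r] // need for r, need in demand.items() if need > 0)


def remaining(free, demand, blocks):
    """Resources left after placing `blocks` thread blocks."""
    return {r: free[r] - demand[r] * blocks for r in free}


def utilization(free):
    """Fraction of each SM resource in use, given what is still free."""
    return {r: 1 - free[r] / SM_CAPACITY[r] for r in SM_CAPACITY}


# Non-shared execution: kernel A fills the SM until its bottleneck is exhausted.
a_blocks = max_blocks(KERNEL_A, SM_CAPACITY)
after_a = remaining(SM_CAPACITY, KERNEL_A, a_blocks)

# Shared execution (greedy assumption): kernel B takes whatever A left unused.
b_blocks = max_blocks(KERNEL_B, after_a)
after_both = remaining(after_a, KERNEL_B, b_blocks)

print("A alone:", a_blocks, "blocks", utilization(after_a))
print("A + B:  ", a_blocks, "+", b_blocks, "blocks", utilization(after_both))
```

With these assumed numbers, kernel A alone exhausts shared memory while leaving 62.5 percent of registers and thread slots idle, and kernel B fits entirely into that leftover capacity. Real kernels rarely complement each other this neatly, and a greedy fill ignores the fairness concern the abstract raises, which is why a dedicated allocation strategy is needed.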