Abstract

Studies show that non-graphics programs can be less optimized for the GPU hardware, leading to significant resource under-utilization. Sharing the GPU among multiple programs can effectively improve utilization, which is particularly attractive to systems (e.g., cloud computing) where many applications require access to the GPU. However, current GPUs lack proper architecture features to support sharing. Initial attempts are very preliminary in that they either provide only static sharing, which requires recompilation or code transformation, or they do not effectively improve GPU resource utilization. We propose Simultaneous Multikernel (SMK), a fine-grained dynamic sharing mechanism, that fully utilizes resources within a streaming multiprocessor by exploiting heterogeneity of different kernels. We extend the GPU hardware to support SMK, and propose several resource allocation strategies to improve system throughput while maintaining fairness. Our evaluation of 45 shared workloads shows that SMK improves GPU throughput by 34 percent over non-shared execution and 10 percent over a state-of-the-art design.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call