Toward QoS-Awareness and Improved Utilization of Spatial Multitasking GPUs

Wei Zhang, Minyi Guo, Kaihua Fu, Quan Chen, Ningxin Zheng, Weihao Cui

doi:10.1109/tc.2021.3064352

Abstract

Datacenters use GPUs to provide the significant computing throughput required by emerging user-facing services. The diurnal access pattern of these services provides a strong incentive to co-locate applications for better GPU utilization, and prior work has focused on enabling co-location on multicore processors and traditional non-preemptive GPUs. However, current GPUs are evolving towards spatial multitasking, which introduces a new set of challenges in eliminating QoS violations. To address this open problem, we explore the underlying causes of QoS violations on spatial multitasking GPUs. In response to these causes, we propose C-Laius, a runtime system that carefully allocates computation resources to co-located applications to maximize the throughput of batch applications while guaranteeing the required QoS of user-facing services. C-Laius supports co-locating one user-facing application with multiple batch applications, as well as co-locating multiple user-facing applications with batch applications. In the case of a single co-located user-facing application, our evaluation on an Nvidia RTX 2080Ti GPU shows that C-Laius improves the utilization of spatial multitasking GPUs by 20.8 percent while achieving the 99th-percentile latency target for user-facing services. In the case of multiple co-located user-facing applications, C-Laius ensures no QoS violations while improving accelerator utilization by 35.9 percent on average.
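
The abstract does not spell out how C-Laius divides the GPU, so the following is only an illustrative sketch of the general idea of QoS-aware spatial allocation, not the authors' algorithm. It assumes a hypothetical latency predictor for the user-facing query that never increases as more streaming multiprocessors (SMs) are assigned, picks the smallest SM count whose predicted latency meets the QoS target, and leaves the remaining SMs to batch applications. The function names, the toy latency model, and the 20 ms target are all assumptions made for illustration; 68 is the SM count of an RTX 2080 Ti.

```python
from typing import Callable, Optional


def min_sms_for_qos(
    predict_latency_ms: Callable[[int], float],
    qos_target_ms: float,
    total_sms: int,
) -> Optional[int]:
    """Return the smallest SM count whose predicted latency meets the target.

    Assumes predicted latency does not increase as SMs are added.
    Returns None when even the whole GPU cannot meet the target.
    """
    for sms in range(1, total_sms + 1):
        if predict_latency_ms(sms) <= qos_target_ms:
            return sms
    return None


def toy_latency_ms(sms: int) -> float:
    """Hypothetical model: 2 ms fixed cost plus work that scales with 1/SMs."""
    return 2.0 + 340.0 / sms


TOTAL_SMS = 68  # SM count of an RTX 2080 Ti
user_sms = min_sms_for_qos(toy_latency_ms, qos_target_ms=20.0, total_sms=TOTAL_SMS)
batch_sms = TOTAL_SMS - user_sms if user_sms is not None else 0
print(user_sms, batch_sms)
```

Under this toy model the user-facing query needs 19 SMs to stay within 20 ms, which leaves 49 SMs for batch work. A real runtime would replace the toy model with a measured or learned predictor and would re-evaluate the split as load changes.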

Journal: IEEE Transactions on Computers
Publication Date: Mar 8, 2021
Citations: 16

Similar Papers

Laius
Wei Zhang ... Weihao Cui
26 Jun 2019

Baymax
Quan Chen ... Hailong Yang
ACM SIGARCH Computer Architecture News | VOL. 44
25 Mar 2016

Baymax
Quan Chen ... Hailong Yang
25 Mar 2016

Baymax
Quan Chen ... Lingjia Tang
ACM SIGPLAN Notices | VOL. 51
25 Mar 2016
