Abstract

Cloud service providers improve resource utilization by co-locating latency-critical (LC) workloads with best-effort batch (BE) jobs in datacenters. However, they usually treat multi-component LCs as monolithic applications and treat BEs as "second-class citizens" when allocating resources to them. Neglecting the differing interference tolerance of LC components and the differing preemption losses of BE workloads can result in missed co-location opportunities for higher throughput. We present Rhythm, a co-location controller that deploys workloads and reclaims resources rhythmically to maximize system throughput while guaranteeing the LC service's tail latency requirement. The key idea is to differentiate the BE throughput launched with each LC component: components with higher interference tolerance are deployed together with more BE jobs. Rhythm also evaluates the preemption loss of each BE job and accordingly assigns it a reclamation priority in a multi-level reclamation queue. We implement and evaluate Rhythm using workloads in the form of containerized processes and microservices. Experimental results show that it can improve system throughput by 47.3%, CPU utilization by 38.6%, and memory bandwidth utilization by 45.4% while guaranteeing the tail latency requirement.
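
The reclamation idea described above can be pictured with a minimal sketch. The class names, level thresholds, and preemption-loss scores below are illustrative assumptions rather than Rhythm's actual implementation; the only behavior taken from the abstract is that BE jobs with lower preemption loss receive lower reclamation priority and are evicted first from a multi-level queue.

```python
# Hypothetical sketch (not Rhythm's actual code): BE jobs are scored by an
# estimated preemption loss and binned into levels; reclamation evicts from
# the cheapest-to-preempt level first.
from dataclasses import dataclass, field
import heapq

@dataclass(order=True)
class BEJob:
    preemption_loss: float          # e.g., progress lost if the job is evicted
    name: str = field(compare=False)

class ReclamationQueue:
    """Multi-level queue: level 0 holds the cheapest jobs to preempt."""
    def __init__(self, level_bounds=(1.0, 10.0, 100.0)):
        self.level_bounds = level_bounds
        self.levels = [[] for _ in range(len(level_bounds) + 1)]

    def add(self, job):
        # Map the job's preemption loss to a reclamation priority level.
        level = sum(job.preemption_loss >= b for b in self.level_bounds)
        heapq.heappush(self.levels[level], job)

    def reclaim(self):
        # Evict the BE job with the smallest preemption loss first.
        for level in self.levels:
            if level:
                return heapq.heappop(level)
        return None

# Usage: when an LC component's tail latency is at risk, pop BE jobs
# until enough resources have been reclaimed.
q = ReclamationQueue()
q.add(BEJob(0.5, "log-compaction"))
q.add(BEJob(42.0, "ml-training-step"))
print(q.reclaim().name)  # "log-compaction" is reclaimed first
```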
