Affinity-aware resource provisioning for long-running applications in shared clusters

Clément Mommessin, Renyu Yang, Natalia V Shakhlevich, Xiaoyang Sun, Satish Kumar, Junqing Xiao, Jie Xu

doi:10.1016/j.jpdc.2023.02.011

Open Access

Abstract

Resource provisioning plays a pivotal role in determining the right amount of infrastructure resources needed to run applications while reducing monetary cost. A significant portion of production clusters is now dedicated to long-running applications (LRAs), which typically take the form of microservices and run for hours or even months. It is therefore of practical importance to plan the placement of LRAs in a shared cluster ahead of time so as to minimize the number of compute nodes they require. Existing work on LRA scheduling is often application-agnostic and does not specifically address the constraints imposed by LRAs, such as co-location affinity constraints and time-varying resource requirements. In this paper, we present an affinity-aware resource provisioning approach for deploying large-scale LRAs in a shared cluster subject to multiple constraints, with the objective of minimizing the number of compute nodes in use. We investigate a broad range of solution algorithms that fall into three main categories: Application-Centric, Node-Centric, and Multi-Node approaches, and tune them for typical large-scale real-world scenarios. Experimental studies driven by the Alibaba Tianchi dataset show that our algorithms achieve competitive scheduling effectiveness and running time compared with the heuristics used in recent work, including Medea and LraSched. The best results are obtained by the Application-Centric algorithms when running time is the primary concern, and by the Multi-Node algorithms when solution quality is the primary concern.
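To make the problem statement concrete, the following is a minimal sketch of node-minimizing placement under a co-location constraint. It is a generic first-fit-decreasing heuristic, not one of the algorithms evaluated in the paper. The function name, the single scalar resource demand, and the representation of anti-affinity as a set of application pairs that must not share a node are all assumptions made for illustration; time-varying demands are not modeled.

```python
def first_fit_decreasing(containers, capacity, conflicts):
    """Place containers on as few nodes as a greedy pass allows.

    containers: list of (app_name, demand) pairs (hypothetical input format).
    capacity:   resource capacity of every node (single scalar, an assumption).
    conflicts:  set of frozensets {app_a, app_b} that must not be co-located.
    Returns (placement, node_count), where placement maps container index
    to node index.
    """
    nodes = []       # each node tracks its free capacity and hosted apps
    placement = {}
    # Consider the largest demands first; ties keep the input order.
    order = sorted(enumerate(containers), key=lambda item: -item[1][1])
    for idx, (app, demand) in order:
        if demand > capacity:
            raise ValueError(f"container {idx} exceeds node capacity")
        for n, node in enumerate(nodes):
            fits = demand <= node["free"]
            compatible = all(
                frozenset((app, other)) not in conflicts
                for other in node["apps"]
            )
            if fits and compatible:
                break
        else:
            # No open node can host this container: open a new one.
            nodes.append({"free": capacity, "apps": set()})
            n, node = len(nodes) - 1, nodes[-1]
        node["free"] -= demand
        node["apps"].add(app)
        placement[idx] = n
    return placement, len(nodes)


containers = [("web", 4), ("db", 4), ("cache", 2), ("web", 2)]
conflicts = {frozenset(("web", "db"))}  # web and db must not share a node
placement, node_count = first_fit_decreasing(containers, 8, conflicts)
print(placement, node_count)
```

In this toy instance the anti-affinity pair forces "web" and "db" onto different nodes even though both would fit on one by capacity alone, which is the kind of interaction between packing and affinity that the provisioning problem has to handle.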

Full Text