Less Provisioning: A Hybrid Resource Scaling Engine for Long-Running Services With Tail Latency Guarantees

Binlei Cai, Laiping Zhao, Keqiu Li, Rongqi Zhang

doi:10.1109/tcc.2020.3016345

Abstract

Modern resource management frameworks guarantee low tail latency for long-running services by over-provisioning resources, which wastes resources and greatly increases service costs. To reduce the over-provisioning cost, we present HRSE, a hybrid resource scaling engine that provisions resources much more efficiently for both periodic and non-periodic workloads of long-running services while guaranteeing the tail latency Service Level Objective (SLO). HRSE employs a convolution-based time series analysis to identify periodic patterns in workloads. If periodic patterns are discovered, HRSE estimates the just-right amount of resources based on the periodic features through a top-K based collaborative filtering approach. Otherwise, it leverages wavelet clustering to capture the short-term patterns in non-periodic workloads and predict the resource demands for the near future. To further enforce the tail latency SLO, HRSE uses an online reprovisioning mechanism that dynamically adjusts the resources to mitigate the performance uncertainty caused by workload burstiness. We fully implement HRSE on top of Docker and conduct extensive experiments using traces from production systems. Testbed experiments show that HRSE increases the average resource utilization to 43 and 45 percent for periodic and non-periodic workloads, respectively, while guaranteeing the same tail latency objective.
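The abstract does not spell out how the convolution-based periodicity analysis works. As a rough illustration only, the sketch below uses a generic autocorrelation test, computed by convolving a mean-centered workload trace with its time-reversed copy, to decide whether a trace looks periodic and to estimate its period. The function name `detect_period`, the `threshold` parameter, and the rule of searching only after the first negative autocorrelation value are assumptions made for this example, not details taken from HRSE.

```python
import numpy as np


def detect_period(series, threshold=0.5):
    """Estimate the dominant period (in samples) of a workload trace.

    Hypothetical helper for illustration; not the HRSE algorithm.
    Returns None when no lag is correlated strongly enough.
    """
    x = np.asarray(series, dtype=float)
    x = x - x.mean()  # remove the constant offset
    n = len(x)
    if n < 4 or not np.any(x):
        return None  # too short, or a flat trace

    # Autocorrelation: convolve the signal with its time-reversed copy
    # and keep the non-negative lags (index 0 is lag 0).
    acf = np.convolve(x, x[::-1], mode="full")[n - 1:]
    acf = acf / acf[0]  # normalize so that acf[0] == 1

    # Search only up to n // 2 so every lag has enough overlapping samples.
    max_lag = n // 2
    negative = np.where(acf[: max_lag + 1] < 0)[0]
    if negative.size == 0:
        return None  # correlation never decays, so no cycle is visible

    # Skip the initial lobe around lag 0, then take the strongest peak.
    start = int(negative[0])
    best = start + int(np.argmax(acf[start : max_lag + 1]))
    return best if acf[best] >= threshold else None


if __name__ == "__main__":
    t = np.arange(240)
    daily = np.sin(2 * np.pi * t / 24)  # synthetic trace, 24-sample cycle
    print(detect_period(daily))
```

In a hybrid design like the one the abstract describes, a result of `None` would correspond to routing the trace to the non-periodic prediction path, while an integer period would feed the periodic path. The threshold of 0.5 is an arbitrary choice for the sketch and would need tuning on real traces.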
