Abstract

Reducing tail latency is increasingly important for improving the user-perceived service experience. User-facing, latency-sensitive cloud applications typically comprise multiple interactive tiers (e.g., Web, App, Database) running in different virtual machines (VMs) with complex interaction patterns. However, previous VM consolidation methods often neglect such interactions between VMs in different tiers, resulting in poor application performance. In this article, we study the consolidation of multi-tier interactive workloads from the new perspective of user-perceived tail latency. We propose a novel profiling-based consolidation methodology that satisfies tail latency requirements while reducing the number of physical machines used. To this end, we first perform large-scale profiling experiments under various consolidation settings in a KVM-virtualized private cluster to establish empirical performance values. We consider two key factors that affect the tail latency of multi-tier workloads: interference with co-located VMs and interaction between tiers. We model the consolidation of multi-tier workloads as an optimization problem with different objectives and constraints, and derive the consolidation schedule. We implement and evaluate the proposed models and compare them with other methods (i.e., without profiling or without considering interaction influence). Extensive experimental results show that the proposed method reduces tail latency by up to 5X compared with the method without profiling, and by up to 1.3X compared with the method without considering the interaction influence between different tiers.
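
To make the problem statement concrete, the following is a minimal illustrative sketch, not the article's actual model: a greedy first-fit placement that opens a new physical machine only when adding a VM to an existing one would violate a tail latency target. All names (`VM`, `predicted_p99`, `consolidate`) and all numbers are hypothetical stand-ins for what profiling would measure; the additive latency model is an assumption made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    app: str   # application the VM belongs to
    tier: str  # "web", "app", or "db"


# Hypothetical profiling results (made-up numbers, in milliseconds).
BASE_P99_MS = 100.0     # p99 latency of an application with no co-located neighbours
INTERFERENCE_MS = 30.0  # assumed penalty per co-located VM from another application
REMOTE_HOP_MS = 20.0    # assumed penalty per adjacent-tier pair on different hosts
TIER_CHAIN = ["web", "app", "db"]


def predicted_p99(app, placement):
    """Estimate an application's p99 latency for a placement (dict VM -> host index)."""
    mine = {vm.tier: host for vm, host in placement.items() if vm.app == app}
    latency = BASE_P99_MS
    # Interference: VMs of other applications sharing a host with this application.
    for vm, host in placement.items():
        if vm.app != app and host in mine.values():
            latency += INTERFERENCE_MS
    # Interaction: adjacent tiers that must talk across hosts.
    for upper, lower in zip(TIER_CHAIN, TIER_CHAIN[1:]):
        if upper in mine and lower in mine and mine[upper] != mine[lower]:
            latency += REMOTE_HOP_MS
    return latency


def consolidate(vms, slo_ms, capacity):
    """Greedy first-fit: reuse a host if capacity and every application's SLO allow it."""
    placement = {}
    hosts = 0
    for vm in vms:
        for host in range(hosts + 1):  # host == hosts means "open a new host"
            load = sum(1 for h in placement.values() if h == host)
            if load >= capacity:
                continue
            trial = {**placement, vm: host}
            apps = {v.app for v in trial}
            if all(predicted_p99(a, trial) <= slo_ms for a in apps):
                placement = trial
                hosts = max(hosts, host + 1)
                break
        else:
            raise ValueError(f"no feasible host for {vm} under a {slo_ms} ms target")
    return placement, hosts
```

Under these assumed numbers, placing two three-tier applications with `consolidate(vms, slo_ms=150, capacity=6)` keeps each application's tiers together and separates the two applications, whereas a loose target such as `slo_ms=1000` packs all six VMs onto one host. A greedy heuristic is used here only for brevity; the article formulates the placement as an optimization problem.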
