Abstract
Distributed storage systems are known to be susceptible to long tails in response time. In modern online storage systems such as Bing, Facebook, and Amazon, the long tail of service latency is of particular concern, with 99.9th percentile response times being orders of magnitude worse than the mean. As erasure codes emerge as a popular technique for achieving high data reliability in distributed storage while remaining space efficient, taming tail latency remains an open problem due to the lack of mathematical models for analyzing such systems. To this end, we propose a framework for quantifying and optimizing tail latency in erasure-coded storage systems. In particular, we derive closed-form upper bounds on tail latency for arbitrary service time distributions and heterogeneous files. Based on the model, we formulate an optimization problem that jointly minimizes the weighted latency tail probability of all files over the placement of files on the servers and the choice of servers used to access the requested files. The non-convex problem is solved using an efficient alternating optimization algorithm. Further, we mathematically quantify, in closed form, the tail index of the service latency for arbitrary erasure-coded storage, i.e., the exponent at which the latency tail probability diminishes to zero, by characterizing the asymptotic behavior of latency distribution tails. We further show that probabilistic scheduling-based algorithms are (asymptotically) optimal, since they achieve the exact tail index. Evaluation results show a significant reduction of tail latency for erasure-coded storage systems under realistic workloads. Based on the offline algorithm, an online version is developed, and its superiority over state-of-the-art algorithms, e.g., join-shortest-queue (JSQ), power-of-d [Pof(d)], and least-load [LL(d)], is shown. Finally, a cloud storage system is implemented in a real cloud environment to show the superiority of our approach over the considered baselines.