Abstract

Successfully integrating cloud storage as a primary layer in the I/O stack is highly challenging, essentially due to two critical issues inherent to cloud storage: high and highly variable I/O latency, and a per-I/O pricing model. Caching is a crucial technology for minimizing the latency and monetary cost associated with cloud I/Os, as it directly determines how frequently the client must communicate with the cloud. Unfortunately, current cloud caching schemes are mostly designed with miss reduction as the sole objective and focus only on improving system performance, ignoring the fact that different cache misses can have completely distinct effects in terms of latency and monetary cost. In this article, we present a cost-aware caching scheme, called GDS-LC, which is highly optimized for cloud storage caching. Unlike traditional caching schemes that merely aim to improve cache hit ratios, and classic cost-aware schemes that can achieve only a single optimization target, GDS-LC offers a comprehensive cache design that considers not only access locality but also object size, access latency, and price, aiming to enhance the user experience with cloud storage in two respects: access latency and monetary cost. To achieve this, GDS-LC virtually partitions the cache space into two regions: a high-priority latency-aware region and a low-priority price-aware region. Each region is managed by a cost-aware caching scheme based on GreedyDual-Size (GDS) and adapted to the cloud storage scenario through clean-dirty differentiation and latency normalization. The GDS-LC framework is highly flexible, and we present a further enhanced algorithm, called GDS-LCF, that also incorporates access frequency into caching decisions. We have built a prototype to emulate a typical cloud client cache and evaluated GDS-LC and GDS-LCF with Amazon Simple Storage Service (S3) in three different scenarios: local cloud, Internet cloud, and heterogeneous cloud. Our experimental results show that our caching schemes effectively achieve both optimization goals: low access latency and low monetary cost. It is our hope that this work can inspire the community to reconsider cache design in the cloud environment, especially for integrating cloud storage into the current storage stack as a primary layer.
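
To make the eviction policy concrete, the following minimal Python sketch illustrates the GreedyDual-Size priority, H(p) = L + cost(p)/size(p), and the two-region demotion described above. This is an illustration under stated assumptions, not the paper's implementation: the class names, the region split ratio, and the flat per-request price are hypothetical, clean-dirty differentiation appears only as a comment, and GDS-LCF would additionally weight the cost term by an access-frequency count.

    import heapq

    class GDSRegion:
        """One cache region managed by GreedyDual-Size (GDS).

        Priority of object p: H(p) = L + cost(p) / size(p), where L is a
        global "inflation" value raised to the priority of each evicted
        object, which implicitly ages entries without per-object clocks.
        """

        def __init__(self, capacity):
            self.capacity = capacity   # region size in bytes
            self.used = 0
            self.L = 0.0               # inflation value
            self.entries = {}          # key -> (H, size, cost)
            self.heap = []             # lazy min-heap of (H, key) pairs

        def insert(self, key, size, cost):
            """Insert or refresh an object; returns evicted (key, size, cost) victims."""
            self.remove(key)           # drop any stale copy before re-inserting
            victims = []
            while self.used + size > self.capacity and self.entries:
                victims.append(self._evict())
            h = self.L + cost / size
            self.entries[key] = (h, size, cost)
            self.used += size
            heapq.heappush(self.heap, (h, key))
            return victims

        def remove(self, key):
            if key in self.entries:
                self.used -= self.entries.pop(key)[1]

        def _evict(self):
            while True:
                h, key = heapq.heappop(self.heap)
                if key in self.entries and self.entries[key][0] == h:  # skip stale heap pairs
                    _, size, cost = self.entries.pop(key)
                    self.used -= size
                    self.L = h         # inflate: newcomers now outrank aged objects
                    return key, size, cost

    class GDSLC:
        """Two virtual regions: a high-priority latency-aware region and a
        low-priority price-aware region; latency-region victims are demoted
        to the price region instead of being dropped immediately."""

        def __init__(self, capacity, split=0.5):   # split ratio: an assumed tuning knob
            hi = int(capacity * split)
            self.latency = GDSRegion(hi)
            self.price = GDSRegion(capacity - hi)

        def access(self, key, size, latency, price):
            # Any copy held in the price region is promoted back on access.
            self.price.remove(key)
            # A dirty object would carry a higher cost here (read plus write-back
            # latency, or GET plus PUT price) -- the clean-dirty differentiation.
            for k, s, _ in self.latency.insert(key, size, latency):
                self.price.insert(k, s, price)     # flat per-request price assumed
            # Victims of the price region leave the client cache entirely.

In this sketch, demotion gives objects evicted on latency grounds a second chance to stay cached on price grounds, which is consistent with the two-level, two-metric design the abstract describes.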
