Analyzing Costs and Optimizations for an Elastic Key-Value Store on Amazon Web Services

David Chiu, Travis Hall, Farhana Kabir, Apeksha Shetty, Gagan Agrawal

doi:10.47164/ijngc.v2i2.109

Abstract

Cloud computing has emerged to provide virtual, pay-as-you-go computing and storage services over the Internet, where the usage cost depends directly on consumption. One compelling feature of Clouds is elasticity: a user can demand, and immediately be given access to, more resources, or relinquish them, based on requirements. However, this feature introduces new challenges in developing applications and services. In this paper, we focus on the challenges of elastic data management in Cloud environments. In particular, we consider an elastic key-value store, which is used to cache intermediate results in a service-oriented system and to accelerate future queries by reusing the stored values. Such a key-value store can clearly benefit from the elasticity offered by Clouds by expanding the cache during query-intensive periods. However, supporting an elastic key-value store involves many challenges, including selecting an appropriate indexing scheme, migrating data upon elastic resource provisioning, and applying optimizations to remove certain overheads in the Cloud. This paper focuses on the design of an elastic key-value store. We consider three ubiquitous methods for indexing: B+-Trees, Extendible Hashing, and Bloom Filters, and we show how these schemes can be modified to exploit elasticity in Clouds. We also evaluate various performance aspects associated with the use of these indexing schemes. Furthermore, we have developed a heuristic for requesting elastic compute resources to expand the cache such that instance startup overheads are minimized in our scheme. Our evaluation studies show that index selection depends on various application- and system-level parameters that we have identified. While we confirm that B+-Trees, which pervade many of today’s key-value systems, would scale well, we also show cases in which Extendible Hashing would outperform B+-Trees. We also conduct an analysis that focuses on the cost–performance tradeoffs of maintaining the cache.
We have compared several Amazon Web Services (AWS) resources as possible cache placements and found that application-dependent attributes, such as unit-data size, total cache size, and persistence, have far-reaching implications for the cost of sustaining the cache. Moreover, while instance-based caches expectedly incur higher cost, the performance that they afford may outweigh the savings of lower-cost options.
