Abstract

Virtual block devices are widely used to meet the block storage needs of hypervisor-based virtual machine (VM) instances, backed by either local or remote storage. However, a high degree of VM co-location makes it increasingly difficult to provision all the necessary block devices from local storage alone, and local storage performance degrades rapidly as workloads interleave. On the other hand, when block devices are acquired from remote storage services, the aggregated network traffic may consume excessive cluster-wide network bandwidth in a cloud data center. To address these challenges, we propose a caching scheme for virtual block devices within the hypervisor. The scheme uses the physical node's finite local storage space as a block-level cache for remote storage blocks, reducing the network traffic bound for the storage servers. This allows a hypervisor-based compute node to serve the hosted VMs' I/O (input/output) requests from its local storage as much as possible, while still letting VMs use storage space well beyond the capacity of the local disks for new virtual disks. Caching virtual disks at the block level in a cloud data center poses several challenges in maintaining high performance while adhering to virtual disk semantics. We have realized the proposed scheme, called vStore, on Xen hypervisor nodes and assessed the effectiveness of its design and the efficiency of its implementation. Our comprehensive experimental evaluation shows that the proposed scheme substantially reduces network traffic (by 49% on average) and incurs less than 12% overhead on storage I/O performance.
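The abstract describes the caching mechanism only at a high level. As a rough illustration of the idea, the sketch below shows a block-level read cache with an LRU policy; the block size, the LRU policy, and the remote_read/local_read/local_write callbacks are assumptions made for illustration and are not taken from the vStore design.

```python
from collections import OrderedDict

BLOCK_SIZE = 4096  # assumed cache granularity; not specified by the paper


class BlockCache:
    """Illustrative LRU block cache: reads are served from the node's local
    storage when the block is cached, and fetched from the remote storage
    service (then cached locally) on a miss."""

    def __init__(self, capacity_blocks, remote_read, local_read, local_write):
        self.capacity = capacity_blocks
        self.index = OrderedDict()        # block number -> True, kept in LRU order
        self.remote_read = remote_read    # hypothetical callback: fetch a block from the storage server
        self.local_read = local_read      # hypothetical callback: read a block from the local cache area
        self.local_write = local_write    # hypothetical callback: write a block to the local cache area

    def read_block(self, blkno):
        if blkno in self.index:           # cache hit: no traffic to the storage server
            self.index.move_to_end(blkno)
            return self.local_read(blkno)
        data = self.remote_read(blkno)    # cache miss: one remote fetch, then cache locally
        if len(self.index) >= self.capacity:
            self.index.popitem(last=False)  # evict the least recently used block
        self.local_write(blkno, data)
        self.index[blkno] = True
        return data


if __name__ == "__main__":
    remote = {7: b"remote block 7".ljust(BLOCK_SIZE, b"\0")}  # stand-in for the remote storage service
    local = {}                                                # stand-in for the local cache area
    cache = BlockCache(capacity_blocks=2,
                       remote_read=lambda b: remote.get(b, b"\0" * BLOCK_SIZE),
                       local_read=local.__getitem__,
                       local_write=local.__setitem__)
    cache.read_block(7)                        # miss: fetched remotely, cached locally
    assert cache.read_block(7) == remote[7]    # hit: served from the local cache
```

A real hypervisor-level cache would also have to handle writes (write-through versus write-back), dirty-block eviction, and crash consistency in order to preserve virtual disk semantics; the sketch deliberately omits these aspects.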

Highlights

  • Modern cloud service infrastructures have reached an unprecedented level of performance and scale, spanning multiple geographies to meet the rapidly growing demands of their user base [1]–[6]

  • We propose an approach to mitigating these issues by utilizing the virtual machine (VM)-hosting node’s local storage as a block-level cache for the remote network storage in use

  • In modern virtualized cloud infrastructure, remote storage systems play a critical role in delivering scalability



Introduction

Modern cloud service infrastructures have reached an unprecedented level of performance and scale, spanning multiple geographies to meet the rapidly growing demands of their user base [1]–[6]. Such a massive scale is due, in part, to the proliferation of data-driven application workloads in the areas of big data and deep learning. If data happens to be located in a separate network unit in the hierarchy, be it at the rack, pod, plane, region, or even data center level, the data has to travel over the network en masse, repeatedly, to wherever the training is conducted [7], [8]. Some form of data caching scheme therefore needs to be employed in the storage system on which the computation occurs.
