Abstract

Consolidating multiple servers into a single physical machine is now commonplace in cloud infrastructures. Virtualized systems often arrange the virtual disks of multiple virtual machines (VMs) on the same underlying storage device while striving to guarantee the service level objective (SLO) for the performance of each VM. Unfortunately, sync operations issued by one VM can make it hard to satisfy the performance SLO by disturbing the I/O activities of other VMs. In this paper, we experimentally show that the disk cache flush operation incurs significant I/O interference among VMs, and we revisit the internal architecture and flush mechanism of the flash memory-based SSD. We then present vFLUSH, a novel VM-aware flush mechanism that supports per-VM persistency control for the disk cache flush operation. We also discuss the long-tail latency issue in vFLUSH and an efficient scheme for mitigating it. Our evaluation with various micro- and macro-benchmarks shows that vFLUSH reduces the average latency of disk cache flush operations by up to 58.5%, thereby improving throughput by up to 1.93×. The method for alleviating the long-tail latency problem, applied on top of vFLUSH, achieves a significant reduction in tail latency by up to 75.9%, with a modest throughput degradation of 2.9-7.2%.

Highlights

  • Modern cloud infrastructures leverage the benefits of virtualization technology for server consolidation and performance isolation [1], [2].

  • We focus on address translation and write buffering, which are closely related to the disk cache flush operation.

  • We examine how an fsync-invoking virtual machine (VM) significantly degrades the throughput of other VMs by delaying their requests until its disk cache flush operations are completed.


Introduction

Modern cloud infrastructures leverage the benefits of virtualization technology for server consolidation and performance isolation [1], [2]. Among many I/O factors, including the type and pattern of I/O requests, a representative one that negatively affects the I/O performance of the system is the sync operation (e.g., fsync or fdatasync). Although this operation is widely used by many cloud applications to prevent unintended data reordering and to provide immediate durability, it incurs severe performance degradation.

Brief Overview of SSD

Unlike traditional HDDs, SSDs rely on several vital software functionalities to manage their complex hardware components, such as channels, dies, planes, blocks, and pages. Among these functionalities, the flash translation layer (FTL) is the core one, complementing the out-of-place update characteristic of flash chips and optimizing the reliability and performance of SSDs. The FTL is mainly organized into four components: address translation, write buffering, garbage collection, and wear-leveling. In the remainder of this paper, we refer to the internal DRAM of the SSD as the DRAM buffer.
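To make the interplay of address translation, write buffering, and the disk cache flush concrete, the following is a minimal sketch of a page-level FTL with a DRAM write buffer. All names here (SimpleFTL, write, flush) are illustrative assumptions for exposition, not the paper's actual implementation; garbage collection and wear-leveling are deliberately omitted.

```python
# Illustrative sketch (not the paper's implementation): a page-level FTL
# whose writes are buffered in DRAM and persisted only on a flush,
# using out-of-place updates to new physical pages.

class SimpleFTL:
    def __init__(self):
        self.mapping = {}       # address translation: logical page -> physical page
        self.next_free = 0      # next free physical page (no GC modeled)
        self.write_buffer = {}  # DRAM buffer: logical page -> pending data

    def write(self, lpn, data):
        # Buffered write: data stays in the DRAM buffer until a flush,
        # so a rewrite of the same logical page coalesces in place.
        self.write_buffer[lpn] = data

    def flush(self):
        # Disk cache flush: persist each buffered page out-of-place and
        # update the logical-to-physical mapping; the previously mapped
        # physical page (if any) becomes invalid.
        for lpn in self.write_buffer:
            self.mapping[lpn] = self.next_free
            self.next_free += 1
        flushed = len(self.write_buffer)
        self.write_buffer.clear()
        return flushed

ftl = SimpleFTL()
ftl.write(0, b"a")
ftl.write(0, b"b")   # overwrite coalesces in the DRAM buffer
ftl.write(1, b"c")
print(ftl.flush())   # -> 2 (two distinct logical pages persisted)
```

The sketch shows why a flush is the expensive step: it is the point where buffered writes actually reach flash and the mapping table is updated, which is the operation an fsync from one VM forces onto the shared device.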

