Automatic Stream Identification to Improve Flash Endurance in Data Centers

Abstract

The demand for high-performance I/O in Storage-as-a-Service (SaaS) is increasing day by day. To address this demand, NAND flash-based Solid-State Drives (SSDs) are commonly used in data centers as cache or top tiers of the storage rack owing to their superior performance compared to traditional hard disk drives (HDDs). Meanwhile, with the capital expenditure of SSDs declining and their storage capacity increasing, all-flash data centers are evolving to serve cloud services better than SSD-HDD hybrid data centers can. During this transition, the biggest challenge is how to reduce the Write Amplification Factor (WAF) and improve SSD endurance, since these devices support only a limited number of program/erase cycles. In particular, storing data with different lifetimes (i.e., mixing I/O streams that do not share similar temporal access patterns, such as reaccess frequency) in a single SSD can cause high WAF, reduce endurance, and degrade SSD performance. Motivated by this, multi-stream SSDs have been developed to enable data with different lifetimes to be stored in different SSD regions. The logic behind this is to reduce the internal movement of data: when garbage collection is triggered, there is a high chance of finding data blocks whose pages are either all invalid or all valid. However, the limitation of this technology is that the system must manually assign the same streamID to data with a similar lifetime. Unfortunately, when data arrives, it is not known how important the data is or how long it will stay unmodified. Moreover, according to our observation, under different definitions of lifetime (i.e., different calculation formulas based on features previously exhibited by the data, such as sequentiality and frequency), streamID identification may have varying impacts on the final WAF of multi-stream SSDs.
Thus, in this article, we first develop a portable and adaptable framework to study the impacts of different workload features and their combinations on write amplification. We then propose a feature-based stream identification approach, which automatically correlates measurable workload attributes (such as I/O size and I/O rate) with high-level workload features (such as frequency and sequentiality) and determines the right combination of workload features for assigning streamIDs. Finally, we develop an adaptable stream assignment technique that dynamically assigns streamIDs to changing workloads. Our evaluation results show that our automated approach to stream detection and separation can effectively reduce the WAF by using appropriate features for stream assignment, with minimal implementation overhead.
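To make the feature-based idea concrete, the sketch below shows one hypothetical way to map measurable write attributes onto streamIDs. The two features (per-LBA write frequency and sequentiality), the thresholds, and the stream mapping are illustrative assumptions for exposition, not the framework's actual formulas.

```python
# Hypothetical sketch of feature-based streamID assignment (not the authors'
# actual implementation). Each write is scored on two assumed features --
# access frequency and sequentiality -- and mapped to one of N streams.
from collections import defaultdict

NUM_STREAMS = 4

class StreamAssigner:
    def __init__(self):
        self.write_count = defaultdict(int)  # per-LBA write frequency
        self.last_lba = None                 # to detect sequential writes

    def assign(self, lba):
        self.write_count[lba] += 1
        freq = self.write_count[lba]
        sequential = (self.last_lba is not None and lba == self.last_lba + 1)
        self.last_lba = lba
        # Combine features: frequently rewritten (hot) data moves toward
        # high-numbered streams; cold sequential data goes to stream 0.
        if sequential and freq == 1:
            return 0                         # cold, sequential: e.g. log append
        return min(1 + freq // 4, NUM_STREAMS - 1)

assigner = StreamAssigner()
stream = assigner.assign(lba=100)
```

In a real multi-stream deployment the returned streamID would be attached to the write (e.g. via a write hint) so the device places same-stream pages in the same flash blocks.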

Similar Papers
  • Dissertation
  • 10.17760/d20323952
Enhancing efficiency and endurance of flash-based storage for big-data processing on cloud and data center infrastructures
  • May 10, 2021
  • Janki Sharadkumar Bhimani

Data is the fuel for analytics across the emerging technologies of the Internet of Things (IoT) and cloud computing, and data management plays a critical role in delivering real-world impact. Three major components of data management are data generation, data categorization, and data storage. It is challenging for any system to manage data efficiently while achieving low latency, high throughput, and good endurance. Efficient data management requires a good ecosystem with fine coordination among multiple facets of the system, such as parallel computing, hierarchical caching and tiering of memory, and low-latency Input/Output (I/O) storage devices. Thus, addressing bottlenecks at each layer is important for accelerating overall production-scale deployments. However, existing data management solutions designed for legacy applications and Hard Disk Drive (HDD) systems are not suitable for evolving cloud infrastructures, big-data workloads, and flash storage systems. Therefore, in this dissertation, we focus on studying resource management for cloud and data centers at three layers: application consolidation, workload data management, and flash-based data storage. We mainly concentrate on investigating big-data processing and high-performance computing platforms. For our research, we develop and deploy cutting-edge technologies such as containerized virtualization with Docker, high-performance machine learning applications, scalable big-data infrastructures such as Spark, space-efficient probabilistic data structures such as Bloom filters, and modern flash-based technologies such as multi-stream and key-value Solid State Drives (SSDs). Firstly, as different applications have different behaviors and resource requirements, we analyze and compare the performance of applications running in the cloud with VMs and Docker containers.
By using fast back-end storage, the performance benefits of a lightweight container platform can be leveraged through quick I/O response. Nevertheless, the performance of simultaneously executing multiple instances of the same or different applications may vary significantly with the number of containers. The performance may also vary with the nature of the applications, because different applications can exhibit different I/O behavior on SSDs in terms of I/O type (read/write), I/O access pattern (random/sequential), and I/O size. Therefore, we investigate and analyze the performance characteristics of both homogeneous and heterogeneous mixtures of I/O-intensive containerized applications operating with high-performance NVMe SSDs, and derive novel design guidelines for achieving optimal and fair operation of both kinds of mixtures. More and more cloud data centers are now replacing traditional HDDs with enterprise SSDs, and both developers and users of these SSDs require thorough benchmarking to evaluate their performance impacts. I/O performance under synthetic workloads or classic benchmarks differs drastically from real I/O activity in the data center. Thus, we propose a new framework, called Pattern I/O generator (PatIO), to collectively capture the enterprise storage behavior prevailing across assorted user workloads and system configurations for different database server applications on flash-based storage. Secondly, we develop a new Docker controller for scheduling workload containers of different types of applications. Our controller decides the optimal batches of simultaneously operating containers in order to minimize total execution time and maximize resource utilization, while also striving to balance throughput among all simultaneously running applications. For optimal operation of any application, it is important to have proper data caching or tiering of memory.
Data temperature identification is an important issue in many areas, such as data caching and storage tiering in modern flash-based storage systems. Therefore, we propose a novel data temperature identification scheme that adopts Bloom filters to efficiently capture both the frequency and recency of data blocks and accurately identify the data temperature of each block. Thirdly, the demand for high-speed 'Storage-as-a-Service' (SaaS) is increasing day by day. SSDs are commonly used in the higher tiers of the storage rack in data centers, and all-flash data centers are evolving to serve cloud services better. Although SSDs guarantee better performance than HDDs, SSD endurance is still a matter of concern. Storing data with different lifetimes in an SSD can cause high write amplification and reduce the endurance and performance of SSDs. Recently, multi-stream SSDs have been developed to enable data with different lifetimes to be stored in different SSD regions and thus reduce write amplification. To use this new multi-streaming technology efficiently, it is important to choose appropriate workload features for assigning the same streamID to data with similar lifetimes. However, we found that streamID identification using different features may have varying impacts on the final write amplification of multi-stream SSDs. Therefore, we develop a portable and adaptable framework to study the impacts of different workload features and their combinations on write amplification. On the other hand, even with fast-performing SSDs, I/O performance continues to be the bottleneck in high-performance computing (HPC) systems. One of the main reasons is that conventional SSDs are block-based devices, which require converting application data into blocks before storing it. Such intermediate data conversions are time-consuming and prevent parallel HPC applications from utilizing the full performance of NVMe SSDs.
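The Bloom-filter-based temperature idea above can be sketched with a small ring of filters: membership in many generations indicates frequency, membership in the newest generation indicates recency. The filter sizes, hash scheme, and rotation policy below are illustrative assumptions, not the dissertation's actual design.

```python
# Minimal sketch of Bloom-filter-based data temperature tracking: a ring of
# filter "generations" captures both frequency (how many generations a block
# appears in) and recency (whether it appears in the newest generation).
import hashlib

class BloomFilter:
    def __init__(self, bits=1024, hashes=3):
        self.bits, self.hashes = bits, hashes
        self.array = 0  # bit array packed into a Python int

    def _positions(self, key):
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:4], "big") % self.bits

    def add(self, key):
        for p in self._positions(key):
            self.array |= 1 << p

    def __contains__(self, key):
        return all(self.array >> p & 1 for p in self._positions(key))

class TemperatureTracker:
    """Ring of filters: temperature = number of generations a block hits."""
    def __init__(self, generations=4):
        self.filters = [BloomFilter() for _ in range(generations)]

    def record(self, block):
        self.filters[0].add(block)      # always record into the newest filter

    def rotate(self):                   # call periodically: age out the oldest
        self.filters.pop()
        self.filters.insert(0, BloomFilter())

    def temperature(self, block):
        return sum(block in f for f in self.filters)

tracker = TemperatureTracker()
tracker.record("blk-42")
tracker.rotate()
tracker.record("blk-42")
```

A caching or tiering layer would treat blocks with a high temperature score as hot and keep them in the faster tier.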
So, we propose a novel Key-Value based Storage infrastructure for Parallel Computing (KV-SiPC), a new method for multi-threaded OpenMP applications to use NVMe key-value SSDs. KV-SiPC simplifies application data management by removing intermediate layers from the I/O stack of the operating system (OS) kernel. Specifically, we design a new HPC key-value API (Application Program Interface) to convert a file-based application directly into a key-value based multi-threaded application. Apart from the traditional parallel compute threads of HPC, KV-SiPC also has additional parallel data threads that are managed at the user end, in the program layer. Such fine-grained control of parallel compute as well as parallel data processing results in better resource utilization. We further develop a key-value concurrency manager that integrates OpenMP pragmas with the key-value Linux kernel device driver (KDD) to maintain memory mapping and thread safety when running a multi-threaded application on KV-SSDs. Thus, in this research, we aim to develop insights into a wide range of opportunities in the enterprise by improving both performance and reliability in cloud and data center systems.
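The file-to-key-value conversion idea can be illustrated with a tiny sketch: each (file name, chunk index) pair becomes a key, so application data reaches the device without passing through the file system's block layer. All names here (`KVStore`, `KVFile`) are invented for illustration and are not the actual KV-SiPC interface.

```python
# Hypothetical sketch of converting file-style I/O into key-value operations,
# in the spirit of the KV-SiPC API described above (names are invented here).
class KVStore:
    """Stand-in for a KV-SSD; a real device would serve these calls directly."""
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

class KVFile:
    """Presents a file-like write interface on top of a key-value store."""
    def __init__(self, store, name, chunk_size=4096):
        self.store, self.name, self.chunk = store, name, chunk_size

    def write(self, offset, data):
        # split the write into per-chunk puts keyed by (name, chunk index);
        # for simplicity this sketch assumes chunk-aligned offsets
        for i in range(0, len(data), self.chunk):
            idx = (offset + i) // self.chunk
            self.store.put((self.name, idx), data[i:i + self.chunk])

    def read(self, chunk_index):
        return self.store.get((self.name, chunk_index))

store = KVStore()
f = KVFile(store, "matrix.dat", chunk_size=4)
f.write(0, b"abcdefgh")
```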

  • Dissertation
  • 10.23860/diss-yang-jing-2019
ACCELERATING DATA ACCESSING BY EXPLOITING FLASH MEMORY TECHNOLOGIES
  • Aug 13, 2019
  • Jing Yang

Flash memory based SSDs (Solid-State Drives) have received a lot of attention recently. An SSD is a semiconductor device that provides great advantages in terms of high-speed random reads, low power consumption, compact size, and shock resistance. Traditional storage systems and algorithms are designed for hard disk drives (HDDs); they do not work well on SSDs because of SSDs' asymmetric read/write performance and unavoidable internal activities, such as garbage collection (GC). There is a great need to optimize current storage systems and algorithms to accelerate data access on SSDs. This dissertation presents four methods to improve the performance of the storage system by exploiting the characteristics of SSDs. GC is one of the critical overheads of any flash memory based SSD: it slows down I/O performance and decreases SSD endurance. This dissertation introduces two methods to minimize the negative impact of GC, “WARCIP: Write Amplification Reduction by Clustering I/O Pages" and “Thermo-GC: Reducing Write Amplification by Tagging Migrated Pages during Garbage Collection". WARCIP uses a clustering algorithm to minimize the rewrite interval variance of pages in a flash block. As a result, pages in a flash block tend to have a similar lifetime, minimizing valid page migrations during GC. The idea of Thermo-GC is to identify data hotness during GC operations and group data with similar lifetimes into the same block. Thermo-GC can minimize valid page movements and reduce GC cost by clustering valid pages based on their hotness. Experimental results show that both WARCIP and Thermo-GC improve SSD performance and reduce data movement during GC, implying extended SSD lifetimes. The SSD fits naturally as a cache between system RAM and the hard disk drive due to its performance/cost characteristics.
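WARCIP's core idea, placing pages with similar rewrite intervals in the same flash block so they tend to invalidate together, can be sketched as a simple nearest-centroid assignment. The fixed interval centers below are an assumption for exposition, not WARCIP's actual clustering algorithm.

```python
# Illustrative sketch of clustering pages by rewrite interval: pages whose
# observed rewrite intervals are close are directed to the same open block,
# so a block's pages tend to be invalidated around the same time.

class IntervalClusterer:
    def __init__(self, centers=(10, 100, 1000, 10000)):
        # centers approximate expected rewrite intervals (in I/O operations)
        self.centers = list(centers)
        self.last_write = {}  # page -> timestamp of its previous write

    def cluster_for(self, page, now):
        interval = now - self.last_write.get(page, now)  # 0 on first write
        self.last_write[page] = now
        # assign to the open block (cluster) whose center is nearest
        return min(range(len(self.centers)),
                   key=lambda i: abs(self.centers[i] - interval))

c = IntervalClusterer()
block = c.cluster_for(page=7, now=0)   # first write: shortest-interval cluster
c.cluster_for(page=7, now=95)          # rewritten after ~100 ops
```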
But traditional cache replacement policies are designed for the hard disk drive and do not work well on SSDs because of SSDs' asymmetric read/write performance and wear issues. In this dissertation we present a new cache management algorithm. The idea is not to cache data in the SSD upon first access; instead, data is cached only once it is determined to be hot enough to warrant caching in the SSD. Data cached in the SSD is managed using an asymmetric replacement policy for reads and writes, by means of conservative promotion upon hits. The nonvolatile nature of the SSD keeps cached data persistent even after power failures or system crashes, so the system can benefit from a hot restart. Current research…
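The lazy-admission idea above, bypassing the SSD cache on first access and admitting a block only once it proves hot, can be sketched as follows. The access-count threshold and LRU eviction are simplifying assumptions, not the dissertation's exact policy.

```python
# Minimal sketch of lazy SSD-cache admission: blocks are admitted only after
# crossing a hotness threshold, sparing the SSD the wear of caching cold data.
from collections import OrderedDict, defaultdict

class LazyAdmissionCache:
    def __init__(self, capacity=2, threshold=2):
        self.capacity, self.threshold = capacity, threshold
        self.accesses = defaultdict(int)   # ghost counters for uncached blocks
        self.cache = OrderedDict()         # cached blocks in LRU order

    def access(self, block):
        if block in self.cache:
            self.cache.move_to_end(block)  # promote on hit
            return "hit"
        self.accesses[block] += 1
        if self.accesses[block] < self.threshold:
            return "bypass"                # not hot enough: serve from HDD
        if len(self.cache) >= self.capacity:
            self.cache.popitem(last=False) # evict the LRU block
        self.cache[block] = True
        return "admitted"

cache = LazyAdmissionCache()
```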

  • Conference Article
  • Cited by 4
  • 10.1109/nas.2019.8834722
Thermo-GC: Reducing Write Amplification by Tagging Migrated Pages during Garbage Collection
  • Aug 1, 2019
  • Jing Yang + 1 more

Flash memory based solid-state drives (SSDs) have been deployed in various systems because of their significant advantages over hard disk drives in terms of throughput and IOPS. One inherent operation that is necessary in an SSD is garbage collection (GC), a procedure that selects an erasure candidate block and moves the valid data on the selected candidate to another block. The performance of an SSD is greatly influenced by GC. While existing studies have made advances in minimizing GC cost, few take advantage of the GC procedure itself. As GC goes on, valid pages in an erasure candidate block tend to have similar lifetimes, which can be exploited to minimize page movements. In this paper, we introduce Thermo-GC. The idea is to identify data hotness during GC operations and group data with similar lifetimes into the same block. By clustering valid pages based on their hotness, Thermo-GC can minimize valid page movements and reduce GC cost. Experimental results show that Thermo-GC reduces data movement during GC by 78% and the write amplification factor by 29.7% on average, implying extended SSD lifetimes.
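The tagging idea above can be sketched as follows: each valid page that survives a GC pass is one migration "colder", and its migration count decides which destination block it joins. The tag encoding and group mapping are assumptions for illustration, not Thermo-GC's exact mechanism.

```python
# Hypothetical sketch of tagging migrated pages during GC: pages that have
# survived more GC migrations are grouped into colder destination blocks.
from collections import defaultdict

def thermo_gc(victim_block, migration_tags, max_groups=3):
    """Redistribute a victim's valid pages into per-hotness destination blocks.

    victim_block: list of (page_id, is_valid); migration_tags: page_id -> count.
    Returns {group index: [page_ids moved there]}.
    """
    groups = defaultdict(list)
    for page_id, is_valid in victim_block:
        if not is_valid:
            continue                      # invalid pages are erased, not moved
        migration_tags[page_id] += 1      # survived one more GC: colder
        group = min(migration_tags[page_id], max_groups) - 1
        groups[group].append(page_id)
    return dict(groups)

tags = defaultdict(int)
placement = thermo_gc([(1, True), (2, False), (3, True)], tags)
```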

  • Conference Article
  • Cited by 15
  • 10.1109/cloud.2018.00010
FIOS: Feature Based I/O Stream Identification for Improving Endurance of Multi-Stream SSDs
  • Jul 1, 2018
  • Janki Bhimani + 6 more

The demand for high-speed 'Storage-as-a-Service' (SaaS) is increasing day by day. SSDs are commonly used in the higher tiers of the storage rack in data centers, and all-flash data centers are evolving to better serve cloud services. Although SSDs guarantee better performance than HDDs, SSD endurance is still a matter of concern. Storing data with different lifetimes in an SSD can cause high write amplification and reduce the endurance and performance of SSDs. Recently, multi-stream SSDs have been developed to enable data with different lifetimes to be stored in different SSD regions and thus reduce write amplification. To use this new multi-streaming technology efficiently, it is important to choose appropriate workload features for assigning the same streamID to data with similar lifetimes. However, we found that streamID identification using different features may have varying impacts on the final write amplification of multi-stream SSDs. Therefore, in this paper we develop a portable and adaptable framework to study the impacts of different workload features and their combinations on write amplification. We also introduce a new feature, named "coherency", to capture the affinity among write operations with respect to their update time. Finally, we propose a feature-based stream identification approach, which correlates measurable workload attributes (such as I/O size and I/O rate) with high-level workload features (such as frequency and sequentiality) and determines a good combination of workload features for assigning streamIDs. Our evaluation results show that our proposed approach can consistently reduce the Write Amplification Factor (WAF) by using appropriate features for stream assignment.
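The "coherency" feature is described only informally above; as an illustrative assumption, the sketch below scores two write targets as coherent when their updates repeatedly fall within the same time window. The windowing scheme is guessed for exposition, not the paper's definition.

```python
# Illustrative sketch of a coherency-style feature: count how often pairs of
# LBAs are updated within the same time window, as a proxy for shared lifetime.
from collections import defaultdict

def coherency(trace, window=10):
    """trace: list of (timestamp, lba). Returns co-update counts per LBA pair."""
    buckets = defaultdict(set)
    for ts, lba in trace:
        buckets[ts // window].add(lba)     # group writes into time windows
    pairs = defaultdict(int)
    for lbas in buckets.values():
        for a in lbas:
            for b in lbas:
                if a < b:                  # count each unordered pair once
                    pairs[(a, b)] += 1
    return dict(pairs)

# LBAs 100 and 200 are updated together in two windows; 300 stands alone
score = coherency([(1, 100), (3, 200), (25, 100), (27, 200), (90, 300)])
```

Pairs with high coherency scores would be candidates for sharing a streamID.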

  • Research Article
  • 10.3390/app16020838
Mitigating Write Amplification via Stream-Aware Block-Level Buffering in Multi-Stream SSDs
  • Jan 14, 2026
  • Applied Sciences
  • Hyeonseob Kim + 1 more

Write amplification factor (WAF) is a critical performance and endurance bottleneck in flash-based solid-state drives (SSDs). Multi-streamed SSDs mitigate WAF by enabling logical data streams to be written separately, thereby improving the efficiency of garbage collection. However, despite the architectural potential of multi-streaming, prior research has largely overlooked the design of write buffer management schemes tailored to this model. In this paper, we propose a stream-aware block-level write buffer management technique that leverages both spatial and temporal locality to further reduce WAF. Although the write buffer operates at the granularity of pages, eviction is performed at the block level, where each block is composed exclusively of pages from the same stream. All pages and blocks are tracked using least recently used (LRU) lists at both global and per-stream levels. To avoid mixing data with disparate hotness and update frequencies, pages from the same stream are dynamically grouped into logical blocks based on their recency order. When space is exhausted, eviction is triggered by selecting a full block of pages from the cold region of the global LRU list. This strategy prevents premature eviction of hot pages and aligns physical block composition with logical stream boundaries. The proposed approach reduces WAF and improves garbage collection efficiency without requiring hardware modification or device-specific extensions. Experimental results confirm that our design delivers consistent performance and endurance improvements across diverse multi-streamed I/O workloads.
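A much-simplified sketch of the scheme described above: pages are buffered per stream in recency order, and eviction flushes a whole same-stream block chosen from the cold end of a global LRU list. Block size, capacity, and the cold-block selection rule below are simplifying assumptions, not the paper's exact design.

```python
# Simplified sketch of stream-aware block-level buffering: page-granularity
# writes, block-granularity eviction, one stream per evicted block.
from collections import OrderedDict

class StreamAwareBuffer:
    def __init__(self, pages_per_block=2, capacity_pages=4):
        self.ppb, self.capacity = pages_per_block, capacity_pages
        self.pages = OrderedDict()   # global LRU of (stream, page) keys
        self.flushed = []            # blocks written out to the SSD

    def write(self, stream, page):
        key = (stream, page)
        if key in self.pages:
            self.pages.move_to_end(key)   # update hit: refresh recency
            return
        if len(self.pages) >= self.capacity:
            self._evict_block()
        self.pages[key] = True

    def _evict_block(self):
        # pick the stream owning the globally coldest page, then flush that
        # stream's oldest pages together as one same-stream block
        cold_stream = next(iter(self.pages))[0]
        victims = [k for k in self.pages if k[0] == cold_stream][: self.ppb]
        for k in victims:
            del self.pages[k]
        self.flushed.append(victims)

buf = StreamAwareBuffer()
```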

  • Conference Article
  • 10.1109/imw.2015.7150265
3X Faster Speed Solid-State Drive with a Write Order Based Garbage Collection Scheme
  • May 1, 2015
  • Chihiro Matsui + 4 more

Solid-state drives (SSDs) are overtaking hard disk drives (HDDs) as high-volume storage in enterprise servers and data centers. However, SSD write performance is limited by the inability to overwrite in place and the need for garbage collection. To reduce the garbage collection (GC) overhead, a logical block address (LBA) scrambler has been proposed. However, the LBA scrambler has two issues: (1) SSD performance decreases under a hot and random workload, and (2) the table size of the LBA scrambler may grow up to 0.85% of the SSD capacity. In this work, a write order (WO) based GC scheme is proposed to solve the first issue. The number of valid pages in a NAND flash block, the write order, and the erase count of the block are considered for victim block selection during GC. One key advantage of WO GC is that it does not require a clock inside the SSD, which would not operate while the SSD is powered off. Further, to solve the second issue of large table size, a Sector Bundling scheme is proposed. The results show that SSD performance is improved 3× and the LBA scrambler table size is reduced by 16%.
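The victim selection described above weighs three criteria: valid page count, write order, and erase count. A sketch of such a scoring rule follows; the linear weighting is a guessed assumption, since the abstract does not give the paper's exact formula.

```python
# Illustrative victim-block selection combining valid pages, write order, and
# erase count. Lower score = better victim: few valid pages to copy, written
# long ago (low write order), and rarely erased (helps wear leveling).
def select_victim(blocks, w_valid=1.0, w_order=0.5, w_erase=0.25):
    """blocks: list of dicts with 'id', 'valid_pages',
    'write_order' (0 = oldest written), and 'erase_count'."""
    def score(b):
        return (w_valid * b["valid_pages"]
                + w_order * b["write_order"]
                + w_erase * b["erase_count"])
    return min(blocks, key=score)["id"]

victim = select_victim([
    {"id": "A", "valid_pages": 10, "write_order": 0, "erase_count": 3},
    {"id": "B", "valid_pages": 2,  "write_order": 5, "erase_count": 1},
])
```

Note that write order is a monotonic counter the SSD maintains itself, which is why the scheme needs no internal clock.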

  • Research Article
  • 10.3390/electronics12092142
On-Demand Garbage Collection Algorithm with Prioritized Victim Blocks for SSDs
  • May 7, 2023
  • Electronics
  • Hyeyun Lee + 2 more

Because of their numerous benefits, solid-state drives (SSDs) are increasingly being used in a wide range of applications, including data centers, cloud computing, and high-performance computing. The growing demand for SSDs has led to continuous improvement in their technology and a reduction in their cost, making them a more accessible storage solution for a wide range of users. Garbage collection (GC) is a process that reclaims wasted storage space in the NAND flash memories used as the storage media of SSDs. However, the GC process can cause performance degradation and lifetime reduction. This paper proposes an efficient GC scheme that minimizes overhead by invoking GC operations only when necessary. Each GC operation is executed in a specific order based on the expected storage gain and the execution cost, ensuring that the storage space requirement is met while minimizing the frequency of GC invocation. This approach not only reduces GC overhead but also improves the overall performance of SSDs, including latency and the write amplification factor (WAF), which is an important indicator of SSD longevity.
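The on-demand idea above can be sketched as: trigger GC only when free space drops below a watermark, then reclaim blocks in order of expected gain (invalid pages freed) per cost (valid pages copied) until a target is met. The watermarks and the gain/cost ratio below are illustrative assumptions, not the paper's exact prioritization.

```python
# Sketch of on-demand GC with prioritized victim blocks: no GC while free
# space is healthy; otherwise reclaim the highest gain-per-cost blocks first.
def on_demand_gc(blocks, free_pages, low_watermark, target_free, pages_per_block):
    """blocks: list of (block_id, valid_pages). Returns block_ids erased."""
    if free_pages >= low_watermark:
        return []  # GC not needed yet: avoid unnecessary overhead
    # gain = invalid pages reclaimed; cost = valid pages that must be copied
    ranked = sorted(blocks,
                    key=lambda b: (pages_per_block - b[1]) / (b[1] + 1),
                    reverse=True)
    erased = []
    for block_id, valid in ranked:
        if free_pages >= target_free:
            break                                  # target met, stop early
        free_pages += pages_per_block - valid      # net pages reclaimed
        erased.append(block_id)
    return erased

plan = on_demand_gc([("A", 60), ("B", 5), ("C", 30)],
                    free_pages=10, low_watermark=20, target_free=100,
                    pages_per_block=64)
```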

  • Research Article
  • Cited by 8
  • 10.1016/j.jpdc.2022.02.006
Lifespan-based garbage collection to improve SSD's reliability and performance
  • Feb 22, 2022
  • Journal of Parallel and Distributed Computing
  • Wen Cheng + 4 more


  • Research Article
  • Cited by 20
  • 10.1109/tdsc.2021.3131571
Lifespan and Failures of SSDs and HDDs: Similarities, Differences, and Prediction Models
  • Jan 1, 2023
  • IEEE Transactions on Dependable and Secure Computing
  • Riccardo Pinciroli + 3 more

Data center downtime typically centers around IT equipment failure, and storage devices are the most frequently failing components in data centers. We present a comparative study of hard disk drives (HDDs) and solid state drives (SSDs), which constitute the typical storage in data centers. Using six years of field data from 100,000 HDDs of different models from the same manufacturer (the Backblaze dataset) and six years of field data from 30,000 SSDs of three models from a Google data center, we characterize the workload conditions that lead to failures. We illustrate that their root failure causes differ from common expectations and remain difficult to discern. For HDDs, we observe that young and old drives do not differ much in their failures; instead, failures may be distinguished by discriminating drives based on the time spent on head positioning. For SSDs, we observe high levels of infant mortality and characterize the differences between infant and non-infant failures. We develop several machine learning failure prediction models that prove surprisingly accurate, achieving high recall and low false positive rates. These models are used beyond simple prediction, as they help us untangle the complex interaction of workload characteristics that lead to failures and identify failure root causes from monitored symptoms.

  • Research Article
  • Cited by 17
  • 10.1007/s10586-015-0421-4
An empirical study of redundant array of independent solid-state drives (RAIS)
  • Jan 31, 2015
  • Cluster Computing
  • Youngjae Kim

Solid-state drives (SSDs) are popular storage media alongside magnetic hard disk drives (HDDs). SSD flash chips are packaged in HDD form factors, and SSDs are compatible with regular HDD device drivers and I/O buses. This compatibility allows easy replacement of individual HDDs with SSDs in existing storage systems. However, under certain circumstances, SSD write performance can be significantly slowed by garbage collection (GC) processes. The frequency of GC activity is directly correlated with the frequency of write operations inside the SSD and the amount of data written to it. GC scheduling is controlled locally by internal SSD logic. This paper studies the feasibility of Redundant Arrays of Independent flash-based Solid-state drives (RAIS). We empirically analyze RAIS performance using commercial off-the-shelf (COTS) SSDs, investigate the performance of various RAIS configurations under a variety of I/O access patterns, and finally compare RAIS with a fast, PCIe-based COTS SSD in terms of performance and cost.

  • Research Article
  • Cited by 22
  • 10.1016/j.jpdc.2020.10.007
An empirical study of I/O separation for burst buffers in HPC systems
  • Nov 1, 2020
  • Journal of Parallel and Distributed Computing
  • Donghun Koo + 9 more


  • Book Chapter
  • Cited by 1
  • 10.1007/978-3-030-05063-4_48
H²-RAID: A Novel Hybrid RAID Architecture Towards High Reliability
  • Jan 1, 2018
  • Tianyu Wang + 6 more

With the rapid development of storage technology, the Solid State Drive (SSD) has received extensive attention from industry and academia. As a promising alternative to the conventional Hard Disk Drive (HDD), the SSD shows its advantages in terms of I/O performance, power consumption, and shock resistance. But the natural constraint of write endurance limits the use of SSDs in large-scale storage systems, especially in scenarios with high reliability requirements. Redundant Arrays of Independent Disks (RAID) technology provides a mechanism for device-level fault tolerance. To guarantee performance, current RAID strategies usually distribute I/O requests evenly to all disks. However, unlike an HDD, the bit error rate (BER) of an SSD increases dramatically as it ages. Therefore, simply introducing RAID technology into an SSD array results in the "correlated SSD failure" problem: all the SSDs in the array wear out at approximately the same time, seriously affecting the reliability of the array. In this paper, we propose a hybrid high-reliability RAID architecture named H²-RAID, which combines SSDs with HDDs to achieve the high performance of SSDs and the high reliability of HDDs. To minimize the performance degradation caused by the low-performance HDDs, we design an HDD-aware backup strategy that coalesces small write requests. We implement the proposed strategy on a Disksim-based simulator. The experimental results show that we reduce the probability of data loss from 11.31% to 0.02% with only a 5% performance loss on average.
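The HDD-aware backup strategy above hinges on coalescing small writes so the backup HDD sees few large sequential writes instead of many small random ones. A minimal sketch, with an invented class name and an assumed count-based flush threshold:

```python
# Illustrative sketch of coalescing small backup writes before the HDD, in the
# spirit of the HDD-aware backup strategy described above (names invented).
class CoalescingBackup:
    def __init__(self, flush_threshold=4):
        self.pending = []            # small writes buffered before the HDD
        self.hdd_writes = []         # each entry = one large sequential write
        self.flush_threshold = flush_threshold

    def backup(self, lba, data):
        self.pending.append((lba, data))
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.pending:
            # sort by LBA and issue one sequential write instead of many
            # small random ones, which HDDs handle poorly
            self.hdd_writes.append(sorted(self.pending))
            self.pending = []

b = CoalescingBackup()
for lba in (9, 3, 7, 1):
    b.backup(lba, b"x")
```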

  • Research Article
  • 10.5573/ieie.2015.52.9.054
An Experimental Analysis of the Linux I/O Stack from the Perspective of SSD Lifetime
  • Sep 25, 2015
  • Journal of the Institute of Electronics and Information Engineers
  • Nam Ki Jeong + 1 more

NAND flash-based SSDs (Solid-State Drives) offer far superior performance to HDDs (Hard Disk Drives) but carry the inherent drawback of a limited number of writes. SSD lifetime is therefore determined by the workload, which poses a major challenge for the technology transitions from SLC (Single Level Cell) to MLC (Multi Level Cell) and from MLC to TLC (Triple Level Cell). Prior studies have mainly addressed SSD lifetime improvement through wear-leveling or hardware architecture. In this paper, we instead perform an experimental analysis of the optimal configuration of the host I/O stack (file system, I/O scheduler, and link power) using JEDEC enterprise workloads, from the perspective of the WAF (Write Amplification Factor), which represents how efficiently, in lifetime terms, the SSD services host write requests through its NAND flash memory. WAF measures the efficiency of an SSD's FTL and is the most objective lifetime metric. The lifetime-optimal I/O stack configuration was MinPower-Dead-XFS: compared with the maximum-performance combination MaxPower-Cfq-Ext4, performance decreased by 13% while lifetime was extended by 2.6×. This demonstrates that optimizing the I/O stack configuration should consider not only SSD performance but also SSD lifetime.

  • Conference Article
  • Cited by 60
  • 10.1109/msst.2011.5937224
Harmonia: A globally coordinated garbage collector for arrays of Solid-State Drives
  • May 1, 2011
  • Youngjae Kim + 5 more

Solid-State Drives (SSDs) offer significant performance improvements over hard disk drives (HDDs) on a number of workloads. The frequency of garbage collection (GC) activity is directly correlated with the pattern, frequency, and volume of write requests, and the scheduling of GC is controlled by logic internal to the SSD. SSDs can exhibit significant performance degradation when GC conflicts with an ongoing I/O request stream, and when SSDs are used in a RAID array, the lack of coordination among the local GC processes amplifies these degradations. No RAID controller or SSD available today has the technology to overcome this limitation. This paper presents Harmonia, a Global Garbage Collection (GGC) mechanism to improve response times and reduce performance variability for a RAID array of SSDs. Our proposal includes a high-level design of an SSD-aware RAID controller and GGC-capable SSD devices, as well as algorithms to coordinate the global GC cycles. Our simulations show that this design improves response time and reduces performance variability for a wide variety of enterprise workloads. For bursty, write-dominant workloads, response time improved by 69% while performance variability was reduced by 71%.
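The coordination idea above can be sketched as a controller that, once enough drives in the array need collection, makes all of them run GC in the same window so no single lagging drive stalls the stripe. The class and method names and the quorum rule are hypothetical, not Harmonia's actual design.

```python
# Minimal sketch of globally coordinated GC across a RAID array of SSDs.
class SSD:
    def __init__(self, name, gc_needed=False):
        self.name, self.gc_needed = name, gc_needed
        self.gc_log = []             # windows in which this drive ran GC

    def run_gc(self, window):
        self.gc_log.append(window)
        self.gc_needed = False

class GlobalGCController:
    def __init__(self, drives, quorum=0.5):
        self.drives, self.quorum = drives, quorum

    def maybe_trigger(self, window):
        # once enough drives need GC, make *all* of them collect together,
        # so the array presents one coordinated pause instead of many
        needing = sum(d.gc_needed for d in self.drives)
        if needing / len(self.drives) >= self.quorum:
            for d in self.drives:
                d.run_gc(window)
            return True
        return False

array = [SSD("ssd0", gc_needed=True), SSD("ssd1"), SSD("ssd2", gc_needed=True)]
ctl = GlobalGCController(array)
triggered = ctl.maybe_trigger(window=1)
```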

  • Conference Article
  • 10.1109/icnidc.2014.7000345
Exploiting the uFLIP benchmark for analyzing SSDs performance
  • Sep 1, 2014
  • Yoonsuk Kang + 4 more

Solid-state drives (SSDs) have higher bandwidth and lower access latency than hard disk drives (HDDs), and have therefore been rapidly replacing them. Many SSD manufacturers produce their own SSDs but do not disclose their SSD architecture or control firmware, so it is necessary to compare and analyze the performance of different kinds of SSDs. Benchmarks are used to analyze a storage device's performance, but most benchmarks are targeted at HDDs. In this paper, we exploit the uFLIP benchmark, which reflects flash device characteristics, to analyze SSD performance. Through this process, we have identified common SSD characteristics and confirmed that a particular SSD shows strength in some situations.
