Abstract

Scientific applications from many problem domains produce and/or access large volumes of data. To support these applications, designers of high-end computing (HEC) systems have greatly increased the capacity of storage systems in recent years. However, because hard disk drives (HDDs) are still the dominant storage device used in HEC storage systems, and because HDD performance has not improved as quickly as the capacity, it can be challenging to deploy a storage system that provides both extreme capacity and extreme performance at a reasonable cost. Solid State Drives (SSDs) are a promising high-bandwidth and low-latency alternative to HDDs for HEC storage systems, but they too have deficiencies: small capacity, limited write cycles, and high cost when compared to HDDs. Because of their complementary characteristics, storage system designers are beginning to consider heterogeneous storage system designs that include both HDDs and SSDs. However, managing the workload so as to take advantage of the strengths of each type of storage device while controlling overhead is a major challenge. In this study, we propose a novel approach for managing a heterogeneous storage system called the Working Set-based Reorganization Scheme (WS-ROS). With WS-ROS, applications write to both HDDs and SSDs using all the available storage system bandwidth. Later, a background process reorganizes the data so as to place the data most likely to be read on SSDs while relegating the data most likely to be written and the data not likely to be accessed onto the slower but higher-capacity HDDs. For our evaluation workloads, the WS-ROS approach provided a 3× to 10× performance improvement compared to a heterogeneous storage system without a working set-based data reorganization scheme, suggesting the value of lazy reorganization of data based on data access working sets.
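The reorganization step described above can be illustrated with a minimal sketch. This is not the WS-ROS algorithm itself, whose details the abstract does not give; it only shows the general idea of a background pass that places read-dominated data on SSDs and leaves write-dominated or cold data on HDDs. The function name `plan_placement`, the `(op, block_id)` trace format, the block-count capacity limit, and the simple read-versus-write counting rule are all assumptions made for illustration.

```python
from collections import Counter


def plan_placement(trace, ssd_capacity):
    """Hypothetical placement planner (illustrative only, not WS-ROS).

    trace: iterable of (op, block_id) pairs, where op is "R" or "W".
    ssd_capacity: assumed SSD size, expressed as a number of blocks.
    Returns (ssd_blocks, hdd_blocks) as two sets of block ids.
    """
    reads, writes = Counter(), Counter()
    for op, block in trace:
        if op == "R":
            reads[block] += 1
        elif op == "W":
            writes[block] += 1

    # Assumed rule: a block is an SSD candidate if it was read more
    # often than it was written during the observed window.
    candidates = [b for b in reads if reads[b] > writes[b]]

    # Most-read blocks first; ties broken by block id for determinism.
    candidates.sort(key=lambda b: (-reads[b], b))

    ssd_blocks = set(candidates[:ssd_capacity])
    # Everything else seen in the trace stays on (or moves to) HDD.
    hdd_blocks = (set(reads) | set(writes)) - ssd_blocks
    return ssd_blocks, hdd_blocks


if __name__ == "__main__":
    sample = [("W", "a"), ("R", "a"), ("R", "a"),
              ("R", "b"), ("W", "c"), ("W", "c"), ("R", "c")]
    print(plan_placement(sample, ssd_capacity=1))
```

In this sketch, writes are not steered at all, which loosely mirrors the abstract's point that applications first write using all available bandwidth and placement is corrected lazily afterwards.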

