Abstract

Data deduplication is widely deployed in solid-state drives (SSDs) to improve storage space utilization and alleviate the endurance problem. However, deduplication increases both fragmentation and access contention, significantly hurting system performance. First, fragmentation at the storage-medium level in an SSD is closely tied to internal parallelism: the degree of fragmentation rises as the degree of parallelism falls. Deduplication removes the duplicate parts of write sequences, so the data that remains is spread less evenly and reads exploit less parallelism. Fragmentation therefore increases, and read performance degrades. Second, the uneven distribution of highly referenced data increases access contention, which prolongs queueing time and further degrades system performance. Motivated by these observations, we propose an elastic data cache (EDC) to improve I/O performance in deduplicated SSDs. EDC redesigns the built-in DRAM-based data cache so that it tracks popular and highly referenced data and keeps them in the cache. To support this cache, EDC modifies the request-handling process. Read requests that would otherwise access fragments in flash memory are mostly served from the fast data cache, which alleviates the negative impact of fragmentation on read performance. The reduced number of read accesses to flash memory also eases access contention, which improves system performance significantly. Extensive experimental results validate the efficiency of EDC, showing that it effectively improves the read performance and the write performance by up to 79% and 85% on average, respectively, in the deduplicated SSD.
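
As a rough illustration of the idea of keeping popular and highly referenced data in the cache, the following minimal sketch shows a toy cache whose eviction policy weighs both read hits and the deduplication reference count. It is not the EDC design: the class `RefAwareCache`, the `ref_weight` parameter, and the scoring rule (hits plus weighted reference count) are all hypothetical and chosen only to make the concept concrete.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    data: bytes
    ref_count: int  # number of logical pages mapped to this physical page
    hits: int = 0   # read hits since the entry was inserted


class RefAwareCache:
    """Toy cache keyed by physical page number (hypothetical, not EDC itself)."""

    def __init__(self, capacity, ref_weight=1):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.ref_weight = ref_weight  # assumed knob: how much references count
        self.entries = {}             # physical page number -> CacheEntry

    def _score(self, entry):
        # Assumed scoring rule: popularity plus weighted reference count.
        return entry.hits + self.ref_weight * entry.ref_count

    def read(self, ppn):
        """Return cached data, or None on a miss (caller would read flash)."""
        entry = self.entries.get(ppn)
        if entry is None:
            return None
        entry.hits += 1
        return entry.data

    def insert(self, ppn, data, ref_count):
        """Cache a page, evicting the lowest-scoring entry if the cache is full."""
        if ppn not in self.entries and len(self.entries) >= self.capacity:
            victim = min(self.entries, key=lambda k: self._score(self.entries[k]))
            del self.entries[victim]
        self.entries[ppn] = CacheEntry(data, ref_count)


cache = RefAwareCache(capacity=2)
cache.insert(1, b"shared", ref_count=5)  # page referenced by five logical pages
cache.insert(2, b"unique", ref_count=1)
cache.read(2)                            # one hit on the unique page
cache.insert(3, b"new", ref_count=1)     # evicts page 2, the lowest score
```

In this sketch the highly referenced page survives eviction even though it was read less recently, which is the kind of behavior a reference-aware cache is meant to encourage; a real design would also need to handle write buffering, reference-count updates, and cache sizing.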
