Abstract

To avoid a poor random write performance, flash-based solid state drives typically rely on an internal log-structure. This log-structure reduces the write amplification and thereby improves the write throughput and extends the drive’s lifespan. In this paper, we analyze the performance of the log-structure combined with the d-choices garbage collection algorithm, which repeatedly selects the block with the fewest number of valid pages out of a set of d randomly chosen blocks, and consider non-uniform random write workloads. Using a mean field model, we show that the write amplification worsens as the hot data gets hotter.Next, we introduce the double log-structure, which uses a separate log for internal and external write requests. Although the double log-structure performs identically to the single log-structure under uniform random writes, we show that it drastically reduces the write amplification of the d-choices algorithm in the presence of hot data. In other words, the double log-structure yields an automatic form of data separation. Further, with the double log-structure there exists an optimal value for d (typically around 10), meaning the greedy garbage collection algorithm is no longer optimal. Finally, both mean field models introduced in this paper are validated using simulation experiments.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.