Abstract
Key-value (KV) stores based on multi-stage structures are widely deployed in the cloud to ingest massive amounts of easily searchable user data. However, current KV storage systems inevitably sacrifice at least one performance objective, such as write performance, read performance, or space efficiency, to optimize the others. To understand the root cause of, and ultimately remove, such performance disparities among representative existing KV stores, we analyze their enabling mechanisms and classify them into two models of data structures that facilitate KV operations: the multi-stage tree (MS-tree), as represented by LevelDB, and the multi-stage forest (MS-forest), as typified by size-tiered compaction in Cassandra. We then build SifrDB, a KV store based on a novel split MS-forest structure, which achieves the lowest write amplification across all workload patterns and minimizes the space reserved for compaction. In addition, we design a highly efficient parallel search algorithm that fully exploits the access parallelism of modern flash-based storage devices to substantially boost read performance. Evaluation results show that under both micro-benchmarks and YCSB benchmarks, SifrDB outperforms its closest competitors, i.e., the popular MS-forest implementations, making it a highly desirable choice for modern large-dataset-driven KV stores.