A critical aspect of modern key-value stores is the interaction between compaction policy and filters. Aggressive compaction reduces the on-disk footprint of a key-value store and can improve query performance, but it can reduce insertion throughput because it is expensive in both I/O and CPU. Filters can mitigate the query costs of lazy compaction, but only if they fit in RAM, which limits query scalability under lazy compaction. And, with fast storage devices, the CPU cost of querying filters in a lazily compacting system can be significant. In this work, we present Mapped SplinterDB, a key-value store that achieves excellent insertion performance, query performance, space efficiency, and scalability by replacing filters with maplets: space-efficient data structures that act as lossy maps with false positives. Critically, we use quotient maplets, which can be merged and resized without access to the underlying data, enabling us to decouple compaction of the data from compaction of the quotient maplets. Thus Mapped SplinterDB can compact data lazily and quotient maplets aggressively, so that each level has multiple sorted runs of data but only one quotient maplet. Quotient maplets are so small that compacting them aggressively is still cheaper than compacting the (much larger) data lazily, so overall we get the insertion performance of a lazily compacted system. And, since there is only one quotient maplet to query on each level, we get the query performance of an aggressively compacted system. Furthermore, quotient maplets can accelerate queries even when they do not fit in RAM, improving scalability to huge datasets. We also show how to use quotient maplets to estimate when a compaction could resolve a high density of updates, enabling Mapped SplinterDB to perform targeted compactions for space recovery. In our benchmarks, Mapped SplinterDB matches the insertion performance of SplinterDB, a state-of-the-art lazily compacted system, and beats RocksDB, an aggressively compacted system, by up to 9×. On queries, Mapped SplinterDB outperforms SplinterDB and RocksDB by up to 89% and 83%, respectively, and scales gracefully to huge datasets. Mapped SplinterDB can dynamically trade update performance for space efficiency, yielding space overheads as low as 15-61% on update-heavy workloads, compared to 80-117% for RocksDB and up to 137% for SplinterDB.
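To make the "lossy map with false positives" idea concrete, the toy C sketch below implements a maplet as an open-addressed table of key fingerprints, each carrying a small payload (here, the index of the sorted run on a level that may hold the key). This is only an illustrative sketch under assumed names (`maplet`, `maplet_insert`, `maplet_lookup`, `run_id`), not SplinterDB's or the paper's actual quotient-maplet design: because only fingerprints are stored, distinct keys can collide, which is what produces false positives, and a negative answer lets a query skip every run on that level.

```c
/* Hypothetical maplet sketch: a lossy map from keys to small values.
 * Like a filter, lookups may return false positives but never false
 * negatives; unlike a filter, each entry also carries a payload.
 * Fixed-size toy; the quotient maplets in the paper can be merged and
 * resized without access to the underlying key-value data. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAPLET_SLOTS 1024

typedef struct {
    uint32_t fingerprint[MAPLET_SLOTS]; /* short hash of each stored key    */
    uint8_t  run_id[MAPLET_SLOTS];      /* payload: which run may hold it   */
    uint8_t  occupied[MAPLET_SLOTS];
} maplet;

/* FNV-1a fingerprint: only this hash is kept, so distinct keys can collide,
 * which is exactly the source of false positives. */
static uint32_t fp_hash(const char *key) {
    uint32_t h = 2166136261u;
    for (; *key; key++) { h ^= (uint8_t)*key; h *= 16777619u; }
    return h;
}

static void maplet_insert(maplet *m, const char *key, uint8_t run_id) {
    uint32_t fp = fp_hash(key);
    for (uint32_t n = 0, i = fp % MAPLET_SLOTS; n < MAPLET_SLOTS;
         n++, i = (i + 1) % MAPLET_SLOTS) {
        if (!m->occupied[i] || m->fingerprint[i] == fp) {
            m->occupied[i]    = 1;
            m->fingerprint[i] = fp;
            m->run_id[i]      = run_id;   /* newest run for this key wins */
            return;
        }
    }
}

/* Returns 1 and sets *run_id if the key MAY be present (false positives
 * possible); returns 0 if it is definitely absent on this level. */
static int maplet_lookup(const maplet *m, const char *key, uint8_t *run_id) {
    uint32_t fp = fp_hash(key);
    for (uint32_t n = 0, i = fp % MAPLET_SLOTS;
         n < MAPLET_SLOTS && m->occupied[i];
         n++, i = (i + 1) % MAPLET_SLOTS) {
        if (m->fingerprint[i] == fp) { *run_id = m->run_id[i]; return 1; }
    }
    return 0;
}

int main(void) {
    maplet m;
    memset(&m, 0, sizeof m);
    maplet_insert(&m, "alice", 0);  /* "alice" lives in run 0 of this level */
    maplet_insert(&m, "bob",   2);  /* "bob" lives in run 2                 */

    uint8_t run;
    if (maplet_lookup(&m, "bob", &run))
        printf("'bob' may be in run %u -- probe only that run\n", run);
    if (!maplet_lookup(&m, "carol", &run))
        printf("'carol' is definitely absent -- skip this level entirely\n");
    return 0;
}
```

In this sketch, a single maplet answers for all runs on a level, which mirrors the abstract's point: with one maplet per level, a query probes at most one run per level even though the data itself is compacted lazily into multiple runs.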