When Tree Meets Hash: Reducing Random Reads for Index Structures on Persistent Memories

Ke Wang, Huanchen Zhang, Guanqun Yang, Mingyu Gao, Yiwei Li

doi:10.1145/3588959

Abstract

Indexing structures are widely used in modern data-processing applications to support high-performance queries, and a variety of recent designs are specifically optimized for the newly available persistent memory (PM). Previous PM indexes focus primarily on reducing the expensive PM writes needed to persist data. However, we find that in tree-based PM indexes, because of the smaller performance gap between writes and random reads on real PM devices, the read-intensive tree traversal phase dominates the overall latency. This observation calls for further optimization of existing indexing structures for PM. In this paper, we propose Extendible Radix Tree (ERT), an efficient indexing structure for PM that significantly reduces tree height to minimize random reads while still maintaining fast in-node search. The key idea is to use extendible hashing for each node in a radix tree. This design allows a relatively large radix-tree fanout, which keeps the tree height small, and also realizes constant-time lookups within a node. Using extendible hashing also allows for incremental node modification without excessive writes during inserts and updates. Range queries are handled efficiently and robustly by enforcing a partial ordering among the keys in the hash table of each node without introducing more hash collisions. Our experiments on both synthetic and real-world data sets demonstrate that ERT achieves up to 2.65×, 4.41×, and 2.43× speedups for search, insert, and range queries, respectively, over the state-of-the-art PM indexes.
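
To make the key idea concrete, the following is a minimal in-memory sketch of a radix tree whose nodes are extendible hash tables. It is an illustration under stated assumptions, not the authors' implementation: the names `ERTSketch`, `ExtendibleHash`, `Bucket`, `SPAN`, and `BUCKET_CAP` are hypothetical, keys are assumed to be fixed-length byte strings whose length is a multiple of `SPAN`, and the sketch omits persistence, crash consistency, concurrency, and the partial key ordering that ERT uses for range queries.

```python
class Bucket:
    def __init__(self, depth):
        self.depth = depth   # local depth: number of hash bits this bucket owns
        self.items = {}      # holds at most ExtendibleHash.BUCKET_CAP entries


class ExtendibleHash:
    """One tree node: a directory of buckets that grows incrementally."""
    BUCKET_CAP = 4           # hypothetical bucket capacity

    def __init__(self):
        self.global_depth = 1
        self.dir = [Bucket(1), Bucket(1)]

    def _slot(self, key):
        # Index the directory by the low global_depth bits of the hash.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key):
        return self.dir[self._slot(key)].items.get(key)

    def put(self, key, value):
        while True:
            b = self.dir[self._slot(key)]
            if key in b.items or len(b.items) < self.BUCKET_CAP:
                b.items[key] = value
                return
            self._split(b)   # only the full bucket is touched, then retry

    def _split(self, b):
        if b.depth == self.global_depth:
            self.dir = self.dir + self.dir   # double the directory
            self.global_depth += 1
        b.depth += 1
        bit = 1 << (b.depth - 1)             # the newly significant hash bit
        sibling = Bucket(b.depth)
        for k in list(b.items):
            if hash(k) & bit:
                sibling.items[k] = b.items.pop(k)
        for i in range(len(self.dir)):
            if self.dir[i] is b and i & bit:
                self.dir[i] = sibling


class ERTSketch:
    SPAN = 2  # bytes of the key consumed per tree level (assumed value)

    def __init__(self, key_len=8):
        self.key_len = key_len
        self.root = ExtendibleHash()

    def _chunks(self, key):
        assert len(key) == self.key_len
        return [int.from_bytes(key[i:i + self.SPAN], "big")
                for i in range(0, self.key_len, self.SPAN)]

    def insert(self, key, value):
        node = self.root
        *inner, last = self._chunks(key)
        for c in inner:
            child = node.get(c)
            if child is None:
                child = ExtendibleHash()
                node.put(c, child)
            node = child
        node.put(last, value)

    def search(self, key):
        node = self.root
        *inner, last = self._chunks(key)
        for c in inner:              # one hash lookup per level
            node = node.get(c)
            if node is None:
                return None
        return node.get(last)
```

With 8-byte keys and a 2-byte span, a lookup in this sketch visits four nodes, where a radix tree that consumes one byte per level would visit eight, and each visit is a single hash probe rather than a scan over a large child array. A full bucket is split on its own, so an insert rewrites one bucket and, occasionally, the directory, rather than the whole node.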

Full Text