Abstract

Learned Index, which utilizes effective machine learning models to accelerate locating sorted data positions, has gained increasing attention in many big data scenarios. Using efficient learned models, the learned indexes build large nodes and flat structures, thereby greatly improving the performance. However, most of the state-of-the-art learned indexes are designed for DRAM, and there is hence an urgent need to enable high-performance learned indexes for emerging Non-Volatile Memory (NVM). In this article, we first evaluate and analyze the performance of the existing learned indexes on NVM. We discover that these learned indexes encounter severe write amplification and write performance degradation due to the requirements of maintaining large sorted/semi-sorted data nodes. To tackle the problems, we propose a novel three-tiered architecture of write-optimized persistent learned index, which is named WIPE , by adopting unsorted fine-granularity data nodes to achieve high write performance on NVM. Thereinto, we devise a new root node construction algorithm to accelerate searching numerous small data nodes. The algorithm ensures stable flat structure and high read performance in large-size datasets by introducing an intermediate layer (i.e., index nodes) and achieving accurate prediction of index node positions from the root node. Our extensive experiments on Intel DCPMM show that WIPE can improve write throughput and read throughput by up to 3.9× and 7×, respectively, compared to the state-of-the-art learned indexes. Also, WIPE can recover from a system crash in ∼ 18 ms. WIPE is free as an open-source software package. 1

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call