Abstract

Increased main memory capacity has led to the rise of in-memory databases. With disk access eliminated, the efficiency of index structures has become critical to performance in these systems. An ideal index structure should deliver high performance across a wide variety of workloads, scale well, and handle large data sets efficiently. Unfortunately, our evaluation shows that most state-of-the-art index structures fail to meet these three goals. For an index to perform well with large data sets, its time complexity should ideally be independent of the key set size. To ensure scalability, critical sections should be minimized and synchronization mechanisms carefully designed to reduce cache coherence traffic. Moreover, the complex memory hierarchy of servers makes data placement and memory access patterns important for high performance across all workload types. In this paper, we present HydraList, a new concurrent, scalable, and high-performance in-memory index structure for massive multi-core machines. The key insight behind the design of HydraList is that an index structure can be divided into two components, a search layer and a data layer, which can be updated independently, leading to lower synchronization overhead. By isolating the search layer, we are able to replicate it across NUMA nodes and reduce cache misses and remote memory accesses. As a result, our evaluation shows that HydraList outperforms other index structures across a variety of workloads and key types.
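
To make the two-layer idea concrete, the following is a minimal single-threaded sketch of an index whose search layer and data layer are updated independently. It is an illustration under stated assumptions, not HydraList itself: the names `DataNode`, `TwoLayerIndex`, and `sync_search_layer` are hypothetical, keys are assumed to be numeric, a sorted list stands in for the search layer (the abstract does not specify the structures used), and locking and NUMA replication are omitted. The point it shows is that a lookup can tolerate a stale search layer by jumping to a nearby data node and then walking the data layer.

```python
import bisect
from collections import deque


class DataNode:
    """One node of the data layer: a small set of keys plus a next pointer."""

    def __init__(self, anchor):
        self.anchor = anchor   # smallest key this node is responsible for
        self.items = {}        # key -> value
        self.next = None       # next data node in key order


class TwoLayerIndex:
    """Hypothetical two-layer index; search-layer updates are deferred."""

    def __init__(self, node_capacity=4):
        self.node_capacity = node_capacity
        head = DataNode(anchor=float("-inf"))
        self.anchors = [head.anchor]   # search layer: sorted anchor keys
        self.nodes = [head]            # search layer: data node per anchor
        self.pending = deque()         # search-layer updates not yet applied

    def _find_node(self, key):
        # Jump through the (possibly stale) search layer first...
        i = bisect.bisect_right(self.anchors, key) - 1
        node = self.nodes[i]
        # ...then walk the data layer to the node that really owns the key.
        while node.next is not None and node.next.anchor <= key:
            node = node.next
        return node

    def insert(self, key, value):
        node = self._find_node(key)
        node.items[key] = value
        if len(node.items) > self.node_capacity:
            self._split(node)

    def _split(self, node):
        # Move the upper half of the keys into a new right-hand node.
        keys = sorted(node.items)
        half = len(keys) // 2
        right = DataNode(anchor=keys[half])
        for k in keys[half:]:
            right.items[k] = node.items.pop(k)
        right.next = node.next
        node.next = right
        # Only the data layer changes now; the search layer catches up later.
        self.pending.append(right)

    def lookup(self, key):
        return self._find_node(key).items.get(key)

    def sync_search_layer(self):
        # Apply deferred updates; a real system might do this in the background.
        while self.pending:
            node = self.pending.popleft()
            i = bisect.bisect_left(self.anchors, node.anchor)
            self.anchors.insert(i, node.anchor)
            self.nodes.insert(i, node)
```

In this sketch, inserts and lookups remain correct whether or not `sync_search_layer` has run; syncing only shortens the walk along the data layer. A separately maintained search layer of this kind could also be copied without touching the data layer, which is the property the abstract relies on for replication.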
