Abstract
Commercial search engines generally maintain hundreds of thousands of machines equipped with large amounts of DRAM, which incurs high hardware cost because DRAM is expensive. Recently, NVM Optane SSD has been considered a promising underlying storage device due to its advantages in price and speed. However, applying NVM to applications that are critical in both latency and I/O bandwidth, while achieving efficiency comparable to an in-memory index, still faces non-trivial challenges, because NVM has much lower I/O speed and bandwidth than DRAM. In this paper, we propose an NVM SSD-optimized query processing framework that aims to address both the latency and bandwidth issues of using NVM in search engines. First, we propose a pipelined query processing methodology that significantly reduces I/O waiting time. Second, we propose a cache-aware query reordering algorithm that enables queries sharing more data to be processed adjacently. Third, we propose a data prefetching mechanism that reduces extra thread waiting time and improves bandwidth utilization. Moreover, we propose intra-query parallel mechanisms for long-tail queries, including query subtask scheduling, a concurrent heap access strategy, query parallelism prediction, and adaptive pipelining. Extensive experimental studies show that our framework significantly outperforms the state-of-the-art baselines, achieving processing latency and throughput comparable to DRAM in both inter-query and intra-query parallel scenarios.
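The abstract does not spell out how the cache-aware query reordering works. As a purely illustrative sketch of the general idea (processing queries that share data next to each other), the following assumes a greedy nearest-neighbor heuristic over shared query terms. The function names, the term-overlap measure, and the greedy strategy are all assumptions made here for illustration and are not taken from the paper.

```python
from typing import List, Set


def shared_terms(a: Set[str], b: Set[str]) -> int:
    """Number of terms two queries have in common (assumed proxy for shared data)."""
    return len(a & b)


def reorder_queries(queries: List[Set[str]]) -> List[int]:
    """Return a processing order (as indices into `queries`).

    Hypothetical greedy heuristic: start from query 0, then repeatedly pick
    the unprocessed query sharing the most terms with the one just chosen.
    Ties go to the lowest index.
    """
    if not queries:
        return []
    remaining = list(range(1, len(queries)))
    order = [0]
    while remaining:
        last = queries[order[-1]]
        # max() returns the first maximal element, so ties keep index order.
        best = max(remaining, key=lambda i: shared_terms(last, queries[i]))
        remaining.remove(best)
        order.append(best)
    return order


if __name__ == "__main__":
    batch = [
        {"nvm", "ssd"},
        {"dram", "price"},
        {"ssd", "latency"},
        {"price", "cost"},
    ]
    print(reorder_queries(batch))
```

In this sketch, queries whose term sets overlap end up adjacent in the returned order, so index data fetched for one query is more likely to still be cached when the next one runs. The greedy pass is quadratic in the batch size, which may only be reasonable for small batches.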