Abstract

Commercial search engines generally maintain hundreds of thousands of machines equipped with large DRAM, which incurs high hardware cost since DRAM is expensive. Recently, NVM-based Optane SSDs have been considered a promising underlying storage device, owing to their price advantage over DRAM and their speed advantage over conventional SSDs. However, achieving efficiency comparable to an in-memory index remains difficult: applying NVM to applications that are critical in both latency and I/O bandwidth faces non-trivial challenges, because NVM has much lower I/O speed and bandwidth than DRAM. In this paper, we propose an NVM SSD-optimized query processing framework that addresses both the latency and bandwidth issues of using NVM in search engines. First, we propose a pipelined query processing methodology that significantly reduces I/O waiting time. Second, we propose a cache-aware query reordering algorithm that schedules queries sharing more data to be processed adjacently. Third, we propose a data prefetching mechanism that reduces extra thread waiting time and improves bandwidth utilization. Moreover, we propose intra-query parallel mechanisms for long-tail queries, including query subtask scheduling, a concurrent heap access strategy, query parallelism prediction, and adaptive pipelining. Extensive experimental studies show that our framework significantly outperforms state-of-the-art baselines, achieving processing latency and throughput comparable to DRAM in both inter-query and intra-query parallel scenarios.
