Abstract
Commercial search engines generally maintain hundreds of thousands of machines equipped with large amounts of DRAM in order to process huge volumes of user queries with low latency, which incurs high hardware cost because DRAM is very expensive. Recently, NVM Optane SSDs have been considered a promising underlying storage device because they are cheaper than DRAM and faster than traditional block devices. However, applying NVM to applications that are critical in both latency and I/O bandwidth, such as search engines, while achieving efficiency comparable to that of an in-memory index still poses non-trivial challenges, because NVM has much lower I/O speed and bandwidth than DRAM. In this paper, we propose an NVM SSD-optimized query processing framework that aims to address both the latency and the bandwidth issues of using NVM in search engines. Our framework has three distinguishing features. First, we propose a pipelined query processing methodology that significantly reduces I/O waiting time through fine-grained overlapping of computation and I/O operations. Second, we propose a cache-aware query reordering algorithm that places queries sharing more data next to each other in the processing order, so that I/O traffic is minimized. Third, we propose a data prefetching mechanism that reduces the extra thread waiting time caused by data sharing and improves bandwidth utilization. Extensive experimental studies show that our framework significantly outperforms state-of-the-art baselines, achieving processing latency and throughput comparable to DRAM while using much less space.
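To make the idea of cache-aware query reordering concrete, the following is a minimal illustrative sketch and not the algorithm proposed in the paper. It assumes each query is represented by the set of terms whose posting lists it must read, and it uses a simple greedy heuristic: after each query, process next the pending query that shares the most terms with it, so that recently loaded data is more likely to still be cached. The function name `reorder_queries` and the term-set representation are hypothetical.

```python
from typing import List, Set


def reorder_queries(queries: List[Set[str]]) -> List[int]:
    """Return a processing order (indices into `queries`).

    Greedy heuristic (an assumption for illustration): start with query 0,
    then repeatedly pick the pending query that shares the most terms with
    the query processed just before it. Ties go to the lowest index.
    """
    if not queries:
        return []
    remaining = set(range(1, len(queries)))
    order = [0]
    while remaining:
        prev_terms = queries[order[-1]]
        # Score = number of shared terms; -i makes the lowest index win ties.
        nxt = max(remaining, key=lambda i: (len(queries[i] & prev_terms), -i))
        order.append(nxt)
        remaining.remove(nxt)
    return order


if __name__ == "__main__":
    batch = [{"a", "b"}, {"x", "y"}, {"b", "c"}, {"y", "z"}]
    print(reorder_queries(batch))
```

In this toy batch, the two queries that share the term "b" end up adjacent, as do the two that share "y". A real system would also need to account for posting-list sizes, cache capacity, and fairness in query latency, none of which this sketch models.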