Large-scale parallel similarity search with Product Quantization for online multimedia services

Guilherme Andrade, André Fernandes, Jeremias M Gomes, Renato Ferreira, George Teodoro

doi:10.1016/j.jpdc.2018.11.009

Abstract

Similarity search in high-dimensional spaces is a core operation in many online multimedia retrieval applications. As these applications grow in popularity, they must handle very large and growing datasets while keeping response times low. The problem is harder in online settings, mainly because the load on these systems varies during execution according to user demand. These variations require the application to adapt at run time in order to minimize response times. In this paper, we address these challenges with an efficient parallelization of the Product Quantization Approximate Nearest Neighbor Search (PQANNS) indexing method. PQANNS answers queries with reduced memory demand and, coupled with the distributed-memory parallelization proposed here, can efficiently handle very large datasets. We also propose mechanisms to minimize query response times in online scenarios in which query rates vary at run time. To this end, our strategies tune the parallelism configuration and task granularity during execution. The parallelism and granularity tuning approaches (ADAPT and ADAPT+G) have been shown, for instance, to reduce query response times by a factor of 6.4 compared with the best static configuration of parallelism and task granularity. Further, the distributed-memory execution using 128 nodes (3584 CPU cores) attained a parallel efficiency of 0.97 on a dataset of 256 billion SIFT vectors.

Full Text