Towards an efficient static scheduling scheme for delivering queries to heterogeneous clusters in the similarity search problem

Roberto Uribe-Paredes, José L. Sánchez, Enrique Arias, Diego Cazorla

doi:10.1007/s11227-013-1079-4

Abstract

Medium and large clusters incorporating hybrid CPU/graphics processing unit (GPU) nodes are present in many datacenters today. They can accelerate many different kinds of applications and are well suited to applications that handle a high volume of data. Similarity search is one such application: large databases must be managed, and hundreds or thousands of queries per second must be answered very quickly. However, the design and usage of heterogeneous computing platforms pose major challenges, since system size, energy saving, task mapping, and scheduling, among other issues, must be handled efficiently. In this paper we focus on the scheduling issue: distributing the incoming queries to all the processing components in the cluster nodes. Our algorithms exploit the available computational resources by processing queries simultaneously on the CPU cores and on the GPUs. We therefore address the problem of how to distribute the queries over the whole system to obtain the best performance, with the aim of defining a heuristic that automatically provides the best distribution. Experimental results show the benefits, in terms of execution time and energy saving, of using an appropriate scheduling scheme.
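To make the idea of a static distribution concrete, the following minimal sketch splits a batch of queries among processing components in proportion to an assumed relative throughput for each one. This is an illustration only and not the heuristic proposed in the paper, which the abstract does not describe; the function name `static_split`, the device labels, and the throughput values are all hypothetical.

```python
def static_split(num_queries, throughputs):
    """Assign queries to devices in proportion to their relative throughput.

    Hypothetical illustration: `throughputs` maps a device label to an
    assumed relative processing rate (e.g. obtained by prior profiling).
    Uses the largest-remainder method so the shares sum to num_queries.
    """
    total = sum(throughputs.values())
    if num_queries < 0 or total <= 0:
        raise ValueError("need a non-negative query count and positive total throughput")

    # Ideal (fractional) share for each device.
    exact = {dev: num_queries * rate / total for dev, rate in throughputs.items()}
    # Round down first, then hand out the leftover queries one by one
    # to the devices with the largest fractional remainders.
    shares = {dev: int(value) for dev, value in exact.items()}
    leftover = num_queries - sum(shares.values())
    by_remainder = sorted(exact, key=lambda dev: exact[dev] - shares[dev], reverse=True)
    for dev in by_remainder[:leftover]:
        shares[dev] += 1
    return shares


if __name__ == "__main__":
    # Assumed rates: one GPU processes queries 8x faster than one CPU core.
    print(static_split(1000, {"cpu0": 1, "cpu1": 1, "gpu0": 8}))
```

Because the split is computed once, before execution, its quality depends entirely on how well the assumed rates reflect the real behavior of each component for the workload at hand.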
