Abstract

Exascale performance requires a level of energy efficiency that is only achievable with specialized hardware. Hence, building a general-purpose HPC system with Exascale performance will require different types of processors, memory technologies, and interconnection networks. Heterogeneous hardware is already present in some top supercomputer systems, which are composed of different compute nodes that, in turn, contain different types of processors and memories. However, heterogeneous hardware is much harder to manage and exploit than homogeneous hardware, further increasing the complexity of applications that run on HPC systems. Most HPC applications use MPI to implement a rigid Single Program Multiple Data (SPMD) execution model that no longer fits the heterogeneous nature of the underlying hardware. MPI does provide a powerful and flexible MPI_Comm_spawn API call designed to exploit heterogeneous hardware dynamically, but at the expense of higher complexity, which has hindered wider adoption of this API. In this paper, we extend the OmpSs programming model to offload MPI kernels dynamically, replacing the low-level and error-prone MPI_Comm_spawn call with high-level, easier-to-use OmpSs pragmas. The evaluation shows that our proposal simplifies the dynamic offload of MPI kernels while keeping competitive performance and scaling to a high number of nodes.
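
For context, the sketch below illustrates the kind of low-level MPI_Comm_spawn usage the paper argues against; it is not taken from the paper. The kernel executable name, process count, and the broadcast payload are placeholder assumptions chosen only to show the explicit coordination the parent must perform.

```c
/* Minimal sketch (not the paper's code): dynamically spawning an MPI kernel
 * with MPI_Comm_spawn, the low-level mechanism the paper replaces with
 * high-level OmpSs pragmas. "./kernel" and the count of 4 are placeholders. */
#include <mpi.h>

int main(int argc, char *argv[])
{
    MPI_Comm intercomm;   /* intercommunicator connecting parent and spawned kernel */
    int errcodes[4];      /* one error code per spawned process */
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Collective over MPI_COMM_WORLD: rank 0 acts as the spawning root. */
    MPI_Comm_spawn("./kernel", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                   0, MPI_COMM_WORLD, &intercomm, errcodes);

    /* The parent must orchestrate data movement and completion explicitly,
     * e.g. broadcasting an input size to the children over the
     * intercommunicator (root rank uses MPI_ROOT, others MPI_PROC_NULL). */
    int n = 1024;
    MPI_Bcast(&n, 1, MPI_INT, rank == 0 ? MPI_ROOT : MPI_PROC_NULL, intercomm);

    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();
    return 0;
}
```

Even this minimal example requires the programmer to manage intercommunicators, per-process error codes, and explicit parent/child data exchange; the paper's OmpSs pragmas hide this bookkeeping behind task annotations.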
