General-Purpose computing on Graphics Processing Units (GPGPU) combines CPUs and GPUs: CPUs run the sequential code, while GPUs run the parallel code. GPGPU has been enabled by two key factors: (i) the massively parallel architecture of GPUs, which allows thousands of individual cores to run parallel code; and (ii) the development of platforms, such as CUDA, that simplify implementing code for GPUs. GPGPU has established itself as the standard computing approach in most computing fields because of the substantial improvements it brings. However, its use is not without problems, such as GPU underutilization, high cost, and high power consumption. In this paper we present NGS (Network GPGPU System), which addresses the underutilization of GPUs in computing centers. NGS orchestrates concurrent access to GPGPU resources from different nodes of a cluster by leveraging remote GPU virtualization and NVIDIA's NVML library. In this way, NGS enables different nodes of the cluster to access remote GPUs as if they were local, while guaranteeing that this access is carried out without collisions. The main novelty is that NGS offers a global and standard solution that is independent of the computing environment used. Experimental results show improvements of up to 4x compared to popular approaches.