Abstract

In edge and cloud environments, using graphics processing units (GPUs) as high-speed parallel computing devices increases the performance of compute-intensive applications. Due to the growing volume and complexity of data to be processed, GPUs are increasingly used in component-based applications. As a result, a sequence of multiple interdependent components is co-located on the GPU and shares its resources. The overall performance of such an application depends on the data transfer overhead and on the performance of each component in the sequence. Managing the components' competing use of shared GPU resources poses several challenges: the lack of a low-overhead, online technique for dynamic GPU resource allocation leads to imbalanced GPU usage and penalizes overall performance. In this paper, we present efficient GPU memory and resource managers that improve overall system performance by using shared memory and by dynamically assigning portions of the shared GPU resources. The portions are determined by the components' workloads and by a throughput-based performance analyzer, while guaranteeing the application's progress. The evaluation results show that our dynamic resource allocation method improves the average performance of applications with varying numbers of concurrent components by up to 29.81% over default concurrent GPU multitasking. We also show that using shared memory yields a 2x performance improvement.
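To make the allocation idea concrete, the following is a minimal, hypothetical sketch (not the paper's actual implementation) of a throughput-based partitioning policy: each component of the pipeline reports its measured throughput, slower components (the bottlenecks) receive larger portions of a normalized GPU budget, and a `min_share` floor guarantees every component makes progress. The component names and the `allocate_shares` function are illustrative assumptions.

```python
def allocate_shares(throughputs, min_share=0.05):
    """Split a normalized GPU budget (1.0) among pipeline components.

    `throughputs` maps a component name to its measured throughput
    (e.g., items/s). Slower components receive larger portions, and
    every component keeps at least `min_share` of the budget so the
    application as a whole is guaranteed to make progress.
    """
    n = len(throughputs)
    if min_share * n >= 1.0:
        raise ValueError("minimum shares exceed the GPU budget")
    # Demand is inversely proportional to measured throughput:
    # the slower a component runs, the more GPU it needs.
    demand = {c: 1.0 / t for c, t in throughputs.items()}
    total = sum(demand.values())
    spare = 1.0 - min_share * n
    return {c: min_share + spare * d / total for c, d in demand.items()}
```

In practice the resulting fractions would be mapped onto a concrete GPU partitioning mechanism (e.g., limiting the number of thread blocks or SMs each component may occupy) and recomputed online as measured throughputs change.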
