Vertical Autoscaling of GPU Resources for Machine Learning in the Cloud

Hyeon-Jun Jang, Hyun-Wook Jin, Yin-Goo Yim

doi:10.1109/bigdata50022.2020.9378248

Abstract

Vertical autoscaling adjusts the amount of resources reserved by a virtual machine in the cloud. Although vertical autoscaling of CPU and memory resources has been studied, existing work has yet to consider GPU resources. In this paper, we propose a vertical autoscaling algorithm that uses Lyapunov optimization to improve the utilization of GPU resources within a budget limit. Our algorithm accounts for the correlation between GPU and CPU resources and requires only resource utilization information to make GPU scaling decisions. Performance measurements show that our GPU vertical autoscaler can provide optimal performance, in terms of execution time and throughput, to containerized CNN-based machine learning applications such as ResNet-50 and Yolo-v4.
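To illustrate the general idea of Lyapunov optimization under a budget, here is a minimal sketch of the generic drift-plus-penalty technique. It is not the algorithm proposed in this paper. The function names, the candidate GPU shares, the linear price model, the "unmet demand" penalty, and the trade-off parameter `v` are all hypothetical, chosen only for illustration. A virtual queue accumulates budget overruns, and each time slot picks the GPU share that minimizes `v * penalty + queue * cost`.

```python
def update_virtual_queue(q, cost, budget_per_slot):
    """Virtual queue for the budget constraint (assumed form):
    grows when a slot's cost exceeds the per-slot budget, never negative."""
    return max(q + cost - budget_per_slot, 0.0)


def choose_gpu_share(q, demand, candidates, price_per_unit, v):
    """Drift-plus-penalty decision for one time slot.

    demand: observed GPU demand in [0, 1] (hypothetical input, e.g. derived
            from utilization); candidates: allowed GPU shares.
    """
    def score(g):
        unmet = max(demand - g, 0.0)   # penalty: demand left unserved
        cost = price_per_unit * g      # assumed linear pricing
        return v * unmet + q * cost    # V * penalty + Q(t) * cost
    return min(candidates, key=score)


def simulate(demands, candidates, price_per_unit, budget_per_slot, v):
    """Run the decision rule over a demand trace; returns (share, queue) pairs."""
    q = 0.0
    history = []
    for d in demands:
        g = choose_gpu_share(q, d, candidates, price_per_unit, v)
        q = update_virtual_queue(q, price_per_unit * g, budget_per_slot)
        history.append((g, q))
    return history
```

In this sketch, a larger `v` favors serving demand even when the queue (accumulated overspend) is large, while a smaller `v` makes the controller cut the GPU share sooner to return to the budget.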

Full Text