Abstract

Vertical autoscaling adjusts the amount of resources reserved by a virtual machine in the cloud. Although vertical autoscaling of CPU and memory resources has been studied, existing work has yet to consider GPU resources. In this paper, we propose a vertical autoscaling algorithm that uses Lyapunov optimization to improve the utilization of GPU resources within a budget limit. Our algorithm accounts for the correlation between GPU and CPU resources and requires only resource utilization information to make GPU scaling decisions. Performance measurements show that our GPU vertical autoscaler can provide optimal performance, in terms of execution time and throughput, to containerized CNN-based machine learning applications such as ResNet-50 and YOLOv4.
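
To illustrate the general idea of Lyapunov optimization under a budget constraint, the following is a minimal drift-plus-penalty sketch. It is not the algorithm proposed in the paper. The names `BudgetQueue` and `choose_gpu_share`, the demand estimate, the unmet-demand penalty, the linear price model, and the weight `v` are all assumptions made for illustration. The sketch also ignores the GPU/CPU correlation that the paper addresses.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BudgetQueue:
    """Virtual queue tracking accumulated spending above the per-slot budget."""
    backlog: float = 0.0

    def update(self, cost: float, budget_per_slot: float) -> None:
        # Standard virtual-queue update: Q(t+1) = max(Q(t) + cost(t) - budget, 0).
        self.backlog = max(self.backlog + cost - budget_per_slot, 0.0)


def choose_gpu_share(
    queue: BudgetQueue,
    demand: float,              # hypothetical demand estimate, as a fraction of one GPU
    options: Sequence[float],   # candidate GPU shares, sorted in ascending order
    price_per_unit: float,      # hypothetical cost of reserving one full GPU for one slot
    v: float = 10.0,            # trade-off weight: larger v favours performance over budget
) -> float:
    """Return the candidate share that minimises the drift-plus-penalty score."""

    def score(share: float) -> float:
        penalty = max(demand - share, 0.0)   # assumed penalty: unmet GPU demand
        cost = price_per_unit * share        # assumed linear pricing
        # The budget term is the same for every option, so Q(t) * cost(t) is enough.
        return v * penalty + queue.backlog * cost

    # min() returns the first minimiser, so ties resolve to the smallest share.
    return min(options, key=score)
```

In each time slot, a controller of this kind would call `choose_gpu_share` with the latest demand estimate, apply the chosen share, and then call `queue.update` with the cost actually incurred. While the backlog is zero the choice is driven by the penalty alone; as overspending accumulates, the backlog term pushes the choice toward cheaper shares.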
