Abstract

Using a compute infrastructure efficiently to execute jobs while respecting Service Level Agreements (SLAs), and thereby guaranteeing Quality of Service (QoS), poses a number of challenges. One such challenge is that SLAs are set before a job executes, whereas the execution environment is subject to disturbances such as poor knowledge of actual resource requirements, demand peaks, and hardware malfunctions. With a fixed resource allocation, the manager of a shared computing environment therefore risks violating user SLAs. Furthermore, the complexity of managing workload executions grows with the number of workloads, which calls for an automatic method to manage and control their execution. The execution-time SLA is especially important in streaming scenarios such as web applications and continuous video processing, and it is the focus of this paper. A method based on adaptive model predictive control (aMPC) is proposed here to adapt the amount of resources allocated to iterative workloads. The methodology is tested on Deep Learning workloads, in standalone and multi-workload versions. The results show that adaptive optimal control with a linearized model improves performance relative to simpler control laws as well as reinforcement learning approaches.

