Abstract

Latency-critical (LC) cloud applications pose three important challenges: 1) meeting tail latency Service-Level Objectives (SLOs), 2) attaining predictable tail latency, and 3) achieving high energy efficiency. In this paper we consider multicore end-systems (leaf nodes) and study how the two important workload-dependent latency sources in network I/O processing, namely interrupts and queuing, contribute to these problems. First, we show that frequency-scaled centralized interrupt processing can be as energy efficient as, and achieve more predictable latency than, traditional distributed interrupt processing. Second, we show that a controlled dynamic frequency scaling approach that adapts to the socket buffer queue length can mitigate queuing-induced tail latency. We design and implement a Runtime Engine that applies these techniques by monitoring the workload online and dynamically allocating resources to meet tail latency targets. Finally, we present a study of six LC applications with different latency and service characteristics. The study shows that our scaling approach saves up to 16% more energy than the Linux ondemand frequency-scaling governor, which adapts based on CPU utilization.
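To illustrate the queue-length-adaptive scaling idea, the following is a minimal userspace sketch, not the paper's Runtime Engine. It assumes the Linux cpufreq "userspace" governor is active on the serving core (and root privileges to write scaling_setspeed), samples the socket receive-queue backlog with the standard FIONREAD ioctl, and uses illustrative thresholds, frequencies, and sampling period; the function names are ours.

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/ioctl.h>

    #define FREQ_PATH "/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed"

    /* Bytes pending in the socket's receive queue (FIONREAD is standard
       on Linux sockets); returns 0 on error. */
    static long backlog_bytes(int sock_fd) {
        int n = 0;
        if (ioctl(sock_fd, FIONREAD, &n) < 0)
            return 0;
        return n;
    }

    /* Write a target frequency (kHz) to the cpufreq userspace governor. */
    static void set_freq_khz(long khz) {
        FILE *f = fopen(FREQ_PATH, "w");
        if (f == NULL) { perror("fopen"); return; }
        fprintf(f, "%ld\n", khz);
        fclose(f);
    }

    /* Control loop: boost the core when the backlog crosses Q_UP, drop
       back when it drains below Q_DOWN. The two thresholds provide
       hysteresis so the loop does not oscillate between levels. */
    static void queue_adaptive_dvfs(int sock_fd) {
        const long F_LOW = 1200000, F_HIGH = 3000000; /* kHz, illustrative */
        const long Q_UP = 16384, Q_DOWN = 1024;       /* bytes, illustrative */
        long cur = F_LOW;
        set_freq_khz(cur);
        for (;;) {
            long q = backlog_bytes(sock_fd);
            if (q > Q_UP && cur != F_HIGH) {
                cur = F_HIGH;
                set_freq_khz(cur);
            } else if (q < Q_DOWN && cur != F_LOW) {
                cur = F_LOW;
                set_freq_khz(cur);
            }
            usleep(1000); /* 1 ms sampling period (assumed) */
        }
    }

    int main(void) {
        /* In a real LC server this fd would be an accepted connection;
           here we only show the shape of the control loop. */
        queue_adaptive_dvfs(/* sock_fd = */ 0);
        return 0;
    }

The sketch captures the core design choice the abstract describes: frequency reacts to queue occupancy (a direct predictor of queuing delay) rather than to CPU utilization, which is what the ondemand governor uses and which lags behind bursty arrivals.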
