Abstract

Latency-critical (LC) cloud applications pose three important challenges: 1) meeting tail latency Service-Level Objectives (SLOs), 2) attaining predictable tail latency, and 3) achieving high energy efficiency. In this paper we consider multicore end-systems (leaf nodes) and study how the two important workload-dependent latency sources in network I/O processing, namely interrupts and queuing, contribute to these problems. First, we show that frequency-scaled centralized interrupt processing can be as energy efficient as, and achieve more predictable latency than, traditional distributed interrupt processing. Second, we show that a controlled dynamic frequency scaling approach that adapts to the socket buffer queue length can mitigate queuing-induced tail latency. We design and implement a Runtime Engine that applies these techniques by monitoring the workload online and dynamically allocating resources to meet tail latency targets. Finally, we present a study of six LC applications with different latency and service characteristics. The study shows that our scaling approach saves up to 16% more energy than the Linux ondemand frequency-scaling governor, which adapts based on CPU utilization.
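To illustrate the queue-length-adaptive scaling idea, the following is a minimal userspace sketch, not the paper's Runtime Engine. It assumes the Linux cpufreq "userspace" governor is active on the serving core (and root privileges to write scaling_setspeed), samples the socket receive-queue backlog with the standard FIONREAD ioctl, and uses illustrative thresholds, frequencies, and sampling period; the function names are ours.

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/ioctl.h>

    #define FREQ_PATH "/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed"

    /* Bytes pending in the socket's receive queue (FIONREAD is standard
       on Linux sockets); returns 0 on error. */
    static long backlog_bytes(int sock_fd) {
        int n = 0;
        if (ioctl(sock_fd, FIONREAD, &n) < 0)
            return 0;
        return n;
    }

    /* Write a target frequency (kHz) to the cpufreq userspace governor. */
    static void set_freq_khz(long khz) {
        FILE *f = fopen(FREQ_PATH, "w");
        if (f == NULL) { perror("fopen"); return; }
        fprintf(f, "%ld\n", khz);
        fclose(f);
    }

    /* Control loop: boost the core when the backlog crosses Q_UP, drop
       back when it drains below Q_DOWN. The two thresholds provide
       hysteresis so the loop does not oscillate between levels. */
    static void queue_adaptive_dvfs(int sock_fd) {
        const long F_LOW = 1200000, F_HIGH = 3000000; /* kHz, illustrative */
        const long Q_UP = 16384, Q_DOWN = 1024;       /* bytes, illustrative */
        long cur = F_LOW;
        set_freq_khz(cur);
        for (;;) {
            long q = backlog_bytes(sock_fd);
            if (q > Q_UP && cur != F_HIGH) {
                cur = F_HIGH;
                set_freq_khz(cur);
            } else if (q < Q_DOWN && cur != F_LOW) {
                cur = F_LOW;
                set_freq_khz(cur);
            }
            usleep(1000); /* 1 ms sampling period (assumed) */
        }
    }

    int main(void) {
        /* In a real LC server this fd would be an accepted connection;
           here we only show the shape of the control loop. */
        queue_adaptive_dvfs(/* sock_fd = */ 0);
        return 0;
    }

The sketch captures the core design choice the abstract describes: frequency reacts to queue occupancy (a direct predictor of queuing delay) rather than to CPU utilization, which is what the ondemand governor uses and which lags behind bursty arrivals.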
