Abstract

Performance interference between QoS and best-effort applications is becoming increasingly severe as data-intensive applications rapidly and widely spread across emerging computing systems. While the completely fair scheduler (CFS) of the Linux kernel has been extensively used to support performance isolation in multitasking environments, it falls short of addressing memory-related interference caused by memory access contention and insufficient cache coverage. Although quite a few memory-aware performance isolation mechanisms have been proposed in the literature, many of them rely on hardware-based solutions, inflexible resource management, or ineffective execution throttling, which makes them difficult to adopt in widely deployed operating systems like Linux running on a COTS SoC platform. We propose a memory-aware fair-share scheduling algorithm that makes QoS applications less susceptible to memory-related interference from co-running applications. Our algorithm carefully separates the genuine memory-related stall from a running task's CPU cycles and compensates the task for the memory-related interference so that the task receives its desired share of CPU before it is too late. The proposed approach is adaptive, effective, and efficient in the sense that it does not rely on any static allocation or partitioning of memory hardware resources and improves the performance of QoS applications with only negligible runtime overhead. Moreover, it is a software-only solution that can be integrated into the kernel scheduler with only minimal modification to the kernel. We implement our algorithm in the CFS of Linux and name the end result mCFS. We show the utility and effectiveness of the approach via extensive experiments.

Highlights

  • Data-intensive applications, most noticeably deep learning-based applications, are rapidly and widely spreading in recently emerging computing systems

  • mCFS scales the actualized CPU time according to the relative performance of the core hosting the task

  • We show the effectiveness of memory-aware CFS (mCFS) through extensive experiments and measurements with SPEC 2017 benchmark suites

Summary

INTRODUCTION

Data-intensive applications, most noticeably deep learning-based applications, are rapidly and widely spreading in recently emerging computing systems. An application with a sequential data access pattern may incur many cache misses during execution, even without cache contention. In this case, it is fair to say that the performance isolation mechanism should not compensate the application for such intrinsic CPU stall. To compute the amount of the genuine memory-related stall, mCFS uses a runtime formula we derive via qualitative analysis of the underlying microarchitecture and quantitative analysis of the execution of diverse applications. As the first step in memory-aware virtual runtime calculation, mCFS performs a computation we name CPU time actualization. In this step, the genuine memory-related stall time of a task is deducted from the task's physical CPU time.

RELATED WORK
BACKGROUND
PROBLEM FORMULATION
THE mCFS ARCHITECTURE
ESTIMATING MEMORY-RELATED INTERFERENCE
Findings
CONCLUSION
