DCAPS

Yaocheng Xiang,Zhenlin Wang,Xiaolin Wang,Zeyu Wang,Yingwei Luo,Zihui Huang

doi:10.1145/3190508.3190511

Abstract

In a multicore system, effective management of shared last level cache (LLC), such as hardware/software cache partitioning, has attracted significant research attention. Some eminent progress is that Intel introduced Cache Allocation Technology (CAT) to its commodity processors recently. CAT implements way partitioning and provides software interface to control cache allocation. Unfortunately, CAT can only allocate at way level, which does not scale well for a large thread or program count to serve their various performance goals effectively. This paper proposes Dynamic Cache Allocation with Partial Sharing (DCAPS), a framework that dynamically monitors and predicts a multi-programmed workload's cache demand, and reallocates LLC given a performance target. Further, DCAPS explores partial sharing of a cache partition among programs and thus practically achieves cache allocation at a finer granularity. DCAPS consists of three parts: (1) Online Practical Miss Rate Curve (OPMRC), a low-overhead software technique to predict online miss rate curves (MRCs) of individual programs of a workload; (2) a prediction model that estimates the LLC occupancy of each individual program under any CAT allocation scheme; (3) a simulated annealing algorithm that searches for a near-optimal CAT scheme given a specific performance goal. Our experimental results show that DCAPS is able to optimize for a wide range of performance targets and can scale to a large core count.

Full Text