Abstract

The slowdown in technology scaling mandates a rethinking of conventional CPU architectures in the quest for higher performance and new capabilities. This work takes a step in this direction by questioning the value of on-chip shared last-level caches (LLCs) in server processors and arguing for a better alternative. Shared LLCs have a number of limitations, including on-chip area constraints that limit storage capacity, long planar interconnect spans that increase access latency, and contention for the shared cache capacity that hurts performance under workload colocation. To overcome these limitations, we propose a Die-Stacked Private LLC Organization (SILO), which combines conventional on-chip private L1 (and, optionally, L2) caches with a per-core private LLC in die-stacked DRAM. By stacking LLC slices directly above each core, SILO avoids long planar wire spans. The use of private caches inherently avoids inter-core cache contention. Last but not least, engineering the DRAM for latency affords low access delays while still providing over 100 MB of capacity per core in today's technology. Evaluation results show that SILO outperforms state-of-the-art conventional cache architectures on a range of scale-out and traditional workloads while delivering strong performance isolation under colocation.
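
To make the organization concrete, the sketch below models the per-core cache path that the abstract describes: each core privately owns its L1, its optional L2, and a die-stacked LLC slice, so no level is shared with other cores. This is an illustrative model only; the class names, the `access` helper, and all latency values are assumptions for the example and are not figures or interfaces from the paper.

```cpp
// Minimal C++ sketch of a SILO-style per-core private cache hierarchy.
// All cycle counts are placeholder assumptions, not results from the paper.
#include <cstdint>
#include <unordered_set>

struct CacheLevel {
    std::unordered_set<uint64_t> lines;  // resident cache-line addresses (tags only)
    uint32_t hit_latency;                // access latency in cycles (assumed)
    bool contains(uint64_t line) const { return lines.count(line) != 0; }
    void fill(uint64_t line) { lines.insert(line); }  // capacity/eviction omitted
};

// Each core owns its entire cache path. Because no level is shared,
// one core's working set can never evict another core's data,
// which is the source of SILO's performance isolation under colocation.
struct Core {
    CacheLevel l1  {{}, 4};    // on-chip private L1
    CacheLevel l2  {{}, 12};   // optional on-chip private L2
    CacheLevel llc {{}, 30};   // die-stacked private LLC slice above the core

    // Returns an illustrative access latency for a cache-line address.
    uint32_t access(uint64_t line, uint32_t memory_latency = 120) {
        if (l1.contains(line))  return l1.hit_latency;
        if (l2.contains(line))  { l1.fill(line); return l2.hit_latency; }
        if (llc.contains(line)) { l2.fill(line); l1.fill(line); return llc.hit_latency; }
        // Miss everywhere: fetch from memory and install along the private path.
        llc.fill(line); l2.fill(line); l1.fill(line);
        return memory_latency;
    }
};
```

In this sketch the LLC lookup never crosses a planar on-chip interconnect or consults another core's slice, mirroring the latency and isolation arguments made above; a shared-LLC model would instead route misses to a common structure whose capacity all cores contend for.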
