Abstract

Modern computing systems and applications place growing demands on memory bandwidth. This demand can be met with fast, large on-die or die-stacked memories, which are typically deployed alongside traditional DRAM in a heterogeneous memory system, either as a DRAM cache or as a hardware- or OS-managed part of memory (PoM). Caches adapt rapidly to application needs and typically provide higher performance, but they reduce the total OS-visible memory capacity. PoM architectures increase the total OS-visible memory capacity but incur additional overheads from swapping large blocks of data between fast and slow memory. In this paper, we propose Chameleon, a hybrid architecture that bridges the gap between cache and PoM architectures. When applications need a large amount of memory, Chameleon uses both fast and slow memories as PoM, maximizing the space available to the application. When an application's footprint is smaller than the total physical memory capacity, Chameleon opportunistically uses the free space as a hardware-managed cache. Chameleon is a hardware-software co-designed system in which the OS notifies the hardware of pages that are allocated or freed, and the hardware dynamically decides when to switch memory regions between PoM and cache modes. In our evaluation of multi-programmed workloads on a system with 4GB of fast memory and 20GB of slow memory, Chameleon improves average performance by 11.6% over PoM and 24.2% over a latency-optimized cache.
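To make the OS-hardware interaction concrete, the following is a minimal sketch of the mode-switching idea in C. It is an illustrative model only, not the paper's actual interface: the region granularity, the notify_page() call, and the all-pages-free policy in reconcile_modes() are assumptions made for this sketch; Chameleon's real policy and metadata are defined in the paper.

```c
/* Hypothetical sketch of Chameleon-style region mode switching.
 * All names, sizes, and the switching policy are illustrative
 * assumptions, not the paper's actual hardware interface. */
#include <stdbool.h>
#include <stdio.h>

#define NUM_REGIONS 64          /* assumed number of memory regions */

typedef enum { MODE_POM, MODE_CACHE } region_mode_t;

typedef struct {
    region_mode_t mode;         /* current use of this region        */
    int allocated_pages;        /* OS-visible pages currently in use */
} region_t;

static region_t regions[NUM_REGIONS];

/* OS -> hardware notification: a page in region r was allocated
 * or freed (the abstract's alloc/free notification, simplified). */
void notify_page(int r, bool allocated) {
    regions[r].allocated_pages += allocated ? 1 : -1;
}

/* Assumed hardware policy: a region with no live allocations can be
 * repurposed as a cache; a region holding allocated pages must stay
 * OS-visible PoM to preserve capacity. */
void reconcile_modes(void) {
    for (int r = 0; r < NUM_REGIONS; r++) {
        regions[r].mode = (regions[r].allocated_pages == 0)
                              ? MODE_CACHE   /* opportunistic caching */
                              : MODE_POM;    /* maximize capacity     */
    }
}

int main(void) {
    notify_page(0, true);       /* OS allocates a page in region 0 */
    reconcile_modes();
    printf("region 0 mode: %s\n",
           regions[0].mode == MODE_POM ? "PoM" : "cache");
    return 0;
}
```

The sketch captures the abstract's core trade-off: regions revert to PoM whenever the application needs the capacity, and only genuinely free regions are donated to the hardware-managed cache.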
