Prefetching techniques for STT-RAM based last-level cache in CMP systems

Mengjie Mao, Guangyu Sun, Yiran Chen, Yong Li, Alex K Jones

doi:10.1109/aspdac.2014.6742868

Abstract

Prefetching is widely used in modern computer systems to mitigate the impact of long memory access latency at the cost of extra memory and cache accesses. However, the efficacy of prefetching degrades significantly in memory hierarchies that use emerging spin-transfer torque random access memory (STT-RAM) as the last-level cache (LLC), because of the long write latency of STT-RAM. In this work, we propose two orthogonal but complementary techniques to improve the prefetching efficacy of STT-RAM based LLC in chip multi-processor (CMP) systems, namely request prioritization (RP) and hybrid local-global prefetch control (HLGPC). Simulation results show that combining these two techniques achieves 6.5%~11% system performance improvement and 4.8%~7.3% LLC energy saving in a quad-core system with a 2MB~8MB STT-RAM based LLC, compared to a system with only basic prefetching.

Full Text