Abstract

Prefetching is widely used in modern computer systems to mitigate the impact of long memory access latency by paying extra cost in memory and cache accesses. However, the efficacy of prefetching significantly degrades in the memory hierarchy using the emerging spin-transfer torque random access memory (STT-RAM) as last-level cache (LLC) due to the long write access latency. In this work, we propose two orthogonal but complimentary techniques to improve the prefetching efficacy of STT-RAM based LLC in chip multi-processor (CMP) systems, namely, request prioritization (RP) and hybrid local-global prefetch control (HLGPC). Simulation results show that by combining these two techniques, we can achieve 6.5%~11% system performance improvement and 4.8%~7.3% LLC energy saving in a quadcore system with a 2MB~8MB STT-RAM based LLC, compared to the system with only basic prefetching.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call