Abstract

On-chip dynamic random access memory (DRAM) cache has recently been employed in the memory hierarchy to mitigate the widening latency gap between high-speed cores and off-chip memory. Two important parameters of a DRAM cache are its miss rate (D$-MR) and its hit latency (D$-HL), as both strongly influence performance. These parameters depend upon the DRAM cache set mapping policy. Recently proposed DRAM set mapping policies are predominantly optimized for either D$-MR or D$-HL. We propose novel DRAM set mapping policies that simultaneously reduce D$-MR (via high associativity) and D$-HL (via improved row buffer hit rates). To further improve D$-HL, we propose a small, low-latency DRAM Tag cache (DTC) structure that can quickly determine whether an access to the DRAM cache will be a hit or a miss. The effectiveness of the DTC depends upon its hit rate, so we present a novel adaptive DTC insertion policy that increases it. We investigate the latency and miss rate tradeoffs in designing a DRAM cache hierarchy and analyze the effects of different policies on overall performance. We evaluate our policies on a wide variety of workloads and compare their performance with three recent proposals for on-chip DRAM caches. For a 16-core system, our set mapping policy, together with our DTC and its adaptive insertion policy, improves the harmonic mean instructions-per-cycle throughput by 25.4%, 15.5%, and 7.3% compared to these three state-of-the-art designs, respectively, while requiring 55% less storage overhead for DRAM cache hit/miss prediction.
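
To make the DTC idea concrete, the following is a minimal sketch, not the authors' design: a small SRAM-resident, set-associative structure that caches the tags of recently touched DRAM-cache sets, so that a hit/miss decision can often be made without reading tags from the DRAM cache. All sizes, field widths, and function names below are illustrative assumptions.

```c
/* Minimal DTC sketch (illustrative assumptions, not the paper's design). */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define DTC_SETS       64   /* assumed number of DTC sets                 */
#define DTC_WAYS        4   /* assumed DTC associativity                  */
#define DCACHE_ASSOC    8   /* assumed DRAM-cache associativity           */

typedef struct {
    bool     valid;
    uint64_t dram_set;              /* DRAM-cache set described by entry  */
    uint64_t tags[DCACHE_ASSOC];    /* tags resident in that DRAM-cache set */
} dtc_entry_t;

static dtc_entry_t dtc[DTC_SETS][DTC_WAYS];

/* Returns true if the DTC holds the tag information for 'dram_set'
 * (a DTC hit) and reports whether 'tag' is present in that set; returns
 * false on a DTC miss, in which case the tags must be read from the
 * DRAM cache itself. */
bool dtc_lookup(uint64_t dram_set, uint64_t tag, bool *dram_cache_hit)
{
    dtc_entry_t *set = dtc[dram_set % DTC_SETS];
    for (int w = 0; w < DTC_WAYS; w++) {
        if (set[w].valid && set[w].dram_set == dram_set) {
            *dram_cache_hit = false;
            for (int i = 0; i < DCACHE_ASSOC; i++)
                if (set[w].tags[i] == tag)
                    *dram_cache_hit = true;
            return true;            /* decision made in fast SRAM */
        }
    }
    return false;                   /* fall back to DRAM-cache tag lookup */
}

int main(void)
{
    /* Hand-populate one entry and query it (illustration only). */
    dtc[5][0] = (dtc_entry_t){ .valid = true, .dram_set = 5,
                               .tags = { 0x1A, 0x2B } };
    bool hit;
    if (dtc_lookup(5, 0x2B, &hit))
        printf("DTC hit: DRAM cache %s\n", hit ? "hit" : "miss");
    else
        printf("DTC miss: read tags from DRAM cache\n");
    return 0;
}
```

The benefit of such a structure comes entirely from its hit rate, which is why the abstract pairs the DTC with an insertion policy that decides which DRAM-cache sets' tags are worth keeping in the limited SRAM budget.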
