Abstract
Large off-chip DRAM caches offer performance and bandwidth improvements for many systems by bridging the gap between on-chip last-level caches and off-chip memories. To avoid the high hit latency that results from serial DRAM accesses for tags and data, prior work proposed co-locating tags and data so they are accessed together. The state-of-the-art block-based DRAM cache design, the Alloy Cache, reduces hit latency but suffers from an increased miss rate due to its direct-mapped organization. In this paper, we propose using compression to increase the associativity of a direct-mapped DRAM cache with little impact on hit latency. If the fill line, the victim line, and the victim tag can be compressed into a single block, the cache effectively becomes two-way set-associative. This mechanism can be extended to compress more lines together and achieve higher associativity. We propose using a low-latency compression algorithm to avoid performance losses. Our analysis of SPEC CPU2006 benchmarks shows that nearly 36% of all sets become 2-way, which increases DRAM cache capacity and reduces conflict misses.
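The fill-time decision described above can be sketched in software. This is a minimal illustrative model, not the paper's hardware design: the block size, the victim-tag overhead, and the use of `zlib` as a stand-in for a low-latency hardware compressor are all assumptions made for the sketch.

```python
import zlib

BLOCK_SIZE = 64   # bytes per DRAM cache block (assumed)
VICTIM_TAG = 6    # bytes reserved for the co-located victim tag (assumed)

def compressed_size(line: bytes) -> int:
    # Stand-in compressor. The paper assumes a low-latency hardware
    # algorithm; zlib is only a convenient software proxy here.
    return len(zlib.compress(line))

def fill(fill_line: bytes, victim_line: bytes) -> str:
    """Decide whether a set can become 2-way on a cache fill.

    If the fill line, the victim line, and the victim's tag all fit
    in one compressed block, both lines are kept and the set is
    effectively 2-way; otherwise the victim is evicted, as in a
    plain direct-mapped cache.
    """
    total = (compressed_size(fill_line)
             + compressed_size(victim_line)
             + VICTIM_TAG)
    if total <= BLOCK_SIZE:
        return "2-way (both lines kept)"
    return "direct-mapped (victim evicted)"

# Highly compressible lines fit together; random-looking data does not.
print(fill(bytes(64), bytes(64)))             # zero-filled lines compress well
print(fill(bytes(range(64)), b"\xa5" * 64))   # incompressible fill line
```

In hardware, the same check would be a comparison on the compressor's reported output sizes, done off the critical read path so hit latency is unaffected.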