Abstract

The least recently used (LRU) replacement policy performs poorly in the last-level cache (LLC) because temporal locality of memory accesses is filtered by first and second level caches. We propose a cache segmentation technique that dynamically adapts to cache access patterns by predicting the best number of not-yet-referenced and already-referenced blocks in the cache. This technique is independent from the LRU policy so it can work with less expensive replacement policies. It can automatically detect when to bypass blocks to the CPU with no extra overhead. In a 2MB LLC single-core processor with a memory intensive subset of SPEC CPU 2006 benchmarks, it outperforms LRU replacement on average by 5.2% with not-recently-used (NRU) replacement and on average by 2.2% with random replacement. The technique also complements existing shared cache partitioning techniques. Our evaluation with 10 multi-programmed workloads shows that this technique improves performance of an 8MB LLC four-core system on average by 12%, with a random replacement policy requiring only half the space of the LRU policy.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call