Abstract

In current multi-core systems with the shared last level cache (LLC) physically distributed across all the cores, both initial data placement and subsequent placement of data close to the requesting core can contribute to reducing memory access latency and power consumption. This paper extends a replication scheme that balances between access latency and cache capacity in shared NUCA designs by selectively replicating frequently used cache lines close to the requesting cores. Our scheme reduces completion time by 15% and improves energy consumption by 27% when compared to the Static-NUCA (S-NUCA) management scheme, when simulated on an eight core system.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call