Abstract

A directory structure is traditionally employed for tracking coherence information of the privately cached blocks in a cache-coherent chip-multiprocessor (CMP). The eviction of a directory entry necessarily invalidates the privately cached copies of the block that the evicted entry was tracking. These forced directory eviction victims pose two major challenges. First, with decreasing directory size, the volume of these victim blocks increases significantly, causing performance degradation. As a result, sizing the directory remains an important challenge. Second, the tight coupling between the directory evictions and the private cache contents can be exploited to launch timing-based side-channel attacks, as has been demonstrated recently. The existing solutions to the first problem allow reducing the directory capacity only up to a certain extent before the performance starts degrading. The existing mitigation technique for the security vulnerability avoids generation of only a certain specific subset of directory victims. In this paper, we present the Zero Directory Eviction Victim (ZeroDEV) coherence protocol and accompanying novel mechanisms that guarantee freedom from invalidations arising from directory victims, thereby completely isolating the private core caches from the coherence directory evictions. This is the first fully hardwired design proposal that enables a practically unbounded coherence directory which, to the core caches in a CMP, appears to never evict a live entry. Unlike the prior proposals that have completely eliminated the directory and the coherence information eviction victims in a multi-/many-core CMP, our proposal does not require any operating system or application software changes. Our proposal, instead, repurposes the on-die last-level cache (LLC) space for holding the evicted directory entries and engineers a novel mechanism to handle directory entry eviction from the LLC without generating any invalidation to the private core caches.
The ZeroDEV protocol, evaluated on multi-threaded and multi-programmed workloads for inclusive and two popular non-inclusive CMP cache hierarchy designs, performs within 1-2% of a well-provisioned traditional baseline. Importantly, as an additional benefit of eliminating directory eviction victims and utilizing the large on-die LLC for caching directory entries, we show that our proposal does not need any dedicated directory structure at all for certain classes of CMP cache hierarchy designs while maintaining the performance level and continuing to guarantee complete isolation of the core caches from directory entry eviction.
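
The contrast drawn above, between a conventional directory whose evictions invalidate private copies and one whose evicted entries are parked in LLC space, can be illustrated with a toy model. The sketch below is not the ZeroDEV protocol: the class `ToyDirectory`, the helper `run_trace`, the LRU replacement policy, the unbounded private caches, and the unbounded spill area standing in for LLC space are all assumptions made for illustration. In particular, it does not model how an entry evicted from the LLC itself would be handled.

```python
from collections import OrderedDict


class ToyDirectory:
    """Toy bounded directory with LRU replacement (hypothetical model).

    Assumes capacity >= 1 and unbounded private caches, so the only
    invalidations counted are those forced by directory evictions.
    """

    def __init__(self, capacity, spill_to_llc=False):
        self.capacity = capacity
        self.spill_to_llc = spill_to_llc
        self.entries = OrderedDict()  # block -> set of sharer core ids
        self.llc_spill = {}           # evicted entries parked in "LLC space" (assumed unbounded)
        self.private = {}             # core id -> set of privately cached blocks
        self.forced_invalidations = 0

    def read(self, core, block):
        if block in self.entries:
            self.entries.move_to_end(block)  # refresh LRU position
        elif block in self.llc_spill:
            self._make_room()
            # Bring the spilled entry, with its sharer set, back into the directory.
            self.entries[block] = self.llc_spill.pop(block)
        else:
            self._make_room()
            self.entries[block] = set()
        self.entries[block].add(core)
        self.private.setdefault(core, set()).add(block)

    def _make_room(self):
        if len(self.entries) < self.capacity:
            return
        victim, sharers = self.entries.popitem(last=False)  # evict the LRU entry
        if self.spill_to_llc:
            # Keep tracking the sharers elsewhere; private copies stay valid.
            self.llc_spill[victim] = sharers
        else:
            # Conventional behavior: the directory victim forces invalidations.
            for c in sharers:
                self.private[c].discard(victim)
                self.forced_invalidations += 1


def run_trace(directory, trace):
    """Apply a list of (core, block) reads and return the directory."""
    for core, block in trace:
        directory.read(core, block)
    return directory


trace = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 0)]
baseline = run_trace(ToyDirectory(capacity=2), trace)
spilling = run_trace(ToyDirectory(capacity=2, spill_to_llc=True), trace)
print(baseline.forced_invalidations, spilling.forced_invalidations)
```

In this model, shrinking `capacity` raises the baseline's forced invalidation count, while the spilling variant keeps every private copy intact regardless of directory size, which is the property the abstract describes from the point of view of the core caches.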
