Abstract

Private caches are critical components for hiding memory access latency in high-performance multiprocessor systems. However, when a parallel program executes, multiple processors may concurrently update distinct portions of the same cache line and cause unnecessary cache invalidations under traditional cache coherence protocols. Such invalidations can be delayed when software enforces a proper order of memory reads and writes using synchronization primitives. Although delaying cache invalidation until the next synchronization instruction avoids unnecessary coherence traffic, it still incurs additional overhead to invalidate and reconcile the inconsistent cache copies. In this paper, a deferred coherence model is presented, which extends the traditional coherence protocol with new partially-modified states to allow multiple writers to simultaneously update different portions of the same cache line. In addition, the proposed model separates the events of write notification and data reconciliation so that the updated data is posted only when another processor asks for it. Furthermore, an efficient merging mechanism is incorporated to reconcile multiple inconsistent copies of a modified line when potentially stale data is accessed. Execution-driven simulation of SPLASH-2 applications shows that the deferred coherence model can outperform the traditional eager coherence model by up to 20%.
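
The idea of reconciling several partially modified copies of one line can be illustrated with a minimal sketch. It is not the paper's mechanism: the abstract does not say how modified portions are tracked, so the per-word dirty flags, the 8-word line size, and the names `LineCopy`, `write`, and `merge` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

WORDS_PER_LINE = 8  # assumed line size in words (hypothetical)


@dataclass
class LineCopy:
    """One processor's private copy of a cache line (hypothetical layout)."""
    words: List[int]
    # Assumption: one dirty flag per word marks the portion this copy modified.
    dirty: List[bool] = field(default_factory=lambda: [False] * WORDS_PER_LINE)

    def write(self, index: int, value: int) -> None:
        """Update one word locally and record that it was modified."""
        self.words[index] = value
        self.dirty[index] = True


def merge(base: List[int], copies: List[LineCopy]) -> List[int]:
    """Reconcile partially modified copies into a single consistent line.

    Each dirty word overrides the base value. Two copies marking the same
    word dirty is treated as an error here, since this sketch assumes the
    program is properly synchronized.
    """
    merged = list(base)
    owner: List[Optional[int]] = [None] * len(base)
    for copy_id, copy in enumerate(copies):
        for i, is_dirty in enumerate(copy.dirty):
            if not is_dirty:
                continue
            if owner[i] is not None:
                raise ValueError(
                    f"word {i} written by copies {owner[i]} and {copy_id}"
                )
            merged[i] = copy.words[i]
            owner[i] = copy_id
    return merged


# Two writers update different words of the same line without invalidating
# each other; the copies are reconciled only when the line is requested.
base_line = [0] * WORDS_PER_LINE
p0 = LineCopy(list(base_line))
p1 = LineCopy(list(base_line))
p0.write(0, 11)
p1.write(5, 22)
reconciled = merge(base_line, [p0, p1])
```

In this sketch both writes proceed locally, and the merge runs only on demand, which mirrors the separation of write notification from data reconciliation described above.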
