This article addresses the challenge of allowing simultaneous and predictable accesses to shared data on multi-core systems. We propose a collection of predictable cache coherence protocols that mandate certain design invariants to ensure predictability. In particular, we enforce these invariants by augmenting the classic modified-shared-invalid (MSI) and modified-exclusive-shared-invalid (MESI) protocols. This allows us to derive worst-case latency bounds for the resulting predictable MSI (PMSI) and predictable MESI (PMESI) protocols. Our analysis shows that while the arbitration latency scales linearly with the number of cores, the coherence latency scales quadratically, which emphasizes the importance of accounting for cache coherence effects in latency bounds. We implement PMSI and PMESI in a detailed micro-architectural simulator and execute SPLASH-2 and synthetic workloads. Results show that the observed latencies are always within the analytical worst-case bounds, and that PMSI and PMESI improve average-case performance by up to 4× over cache-bypassing mechanisms that disallow caching of shared data in the cores’ private caches. Compared to the conventional MSI and MESI protocols, PMSI and PMESI have average slowdowns of 1.45× and 1.46×, respectively.