Abstract

This paper presents a power-efficient parallel cache coherence protocol for network-on-chip (NoC) interconnection fabrics on many-core chips. As the number of processor cores grows, directory-based cache coherence protocols scale better and are expected to be used in future chip architectures. However, characterization of a directory-based protocol on a NoC platform shows that many cycles are spent on command communication to ensure data consistency before a data packet is transmitted or processed, which limits performance. To address this problem, a parallel cache coherence protocol is designed that decouples the transmission of data packets from that of command packets. The parallel mechanism also provides an opportunity to reduce power consumption by using power-optimized wires. A formal speedup model is established to estimate the performance of this approach. Simulation experiments show significant performance improvement and good system scalability: (1) the average latency reductions are 4.38%, 5.35%, 8.53%, and 11.25% for 16, 32, 64, and 128 cores, respectively; (2) the number of L2 cache accesses is greatly reduced; (3) the proposed protocol improves system performance by up to 18.6%; and (4) the average power savings are 5.66%, 8.14%, 10.73%, and 13.23% for 16, 32, 64, and 128 cores, respectively.
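The intuition behind decoupling command and data packets can be illustrated with a toy latency model. This is a minimal sketch under stated assumptions, not the formal speedup model mentioned in the abstract: the function names and cycle counts are hypothetical, and it assumes the idealized case in which command and data transmission overlap completely.

```python
# Toy model (an assumption for illustration, not the paper's speedup model).

def serialized_latency(command_cycles, data_cycles):
    """Baseline: the data packet moves only after the command exchange finishes."""
    return command_cycles + data_cycles


def overlapped_latency(command_cycles, data_cycles):
    """Idealized decoupled case: command and data packets travel concurrently."""
    return max(command_cycles, data_cycles)


def latency_reduction(command_cycles, data_cycles):
    """Fractional latency saved by full overlap, relative to the baseline."""
    base = serialized_latency(command_cycles, data_cycles)
    return (base - overlapped_latency(command_cycles, data_cycles)) / base


if __name__ == "__main__":
    # Hypothetical cycle counts, chosen only to exercise the functions.
    print(latency_reduction(10, 30))
```

Full overlap is an upper bound on the benefit; in a real protocol only part of the command traffic can proceed in parallel with the data transfer, so actual reductions are smaller.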
