Abstract

In chip multiprocessor (CMP) systems, interference on shared resources such as on-chip caches typically leads to unbalanced progress among threads. Because of synchronization primitives such as barriers and locks, cores running fast threads must waste cycles waiting for cores running slow threads, which hurts both performance and energy efficiency. To improve performance and reduce energy consumption, this paper proposes adapting the cache coherence policy for each thread according to its level of delay tolerance. Specifically, it proposes Thread progrEss Aware Coherence Adaption (TEACA), which uses thread progress information as hints for coherence adaption. TEACA dynamically uses memory system statistics to estimate the progress of threads. Based on the estimated progress, TEACA categorizes threads into leader threads and laggard threads. These categorization decisions are then leveraged for efficient coherence adaption on CMP systems that support hybrid coherence protocols. Experimental results show that, on a 64-core CMP system, TEACA outperforms the directory protocol in application execution time and a recently proposed hybrid protocol in both application execution time and energy dissipation.
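The leader/laggard categorization described above can be illustrated with a minimal sketch. The abstract does not specify which memory system statistics TEACA uses or how the threshold is chosen, so everything here is an assumption for illustration only: the progress proxy (fraction of an interval not spent stalled on memory), the median cutoff, and the names `estimate_progress` and `categorize` are all hypothetical and are not taken from the paper.

```python
from statistics import median


def estimate_progress(stall_cycles, interval_cycles):
    """Hypothetical progress proxy (an assumption, not TEACA's actual metric):
    the fraction of the sampling interval not spent stalled on memory."""
    return 1.0 - stall_cycles / interval_cycles


def categorize(stalls_by_thread, interval_cycles):
    """Label each thread 'leader' or 'laggard'.

    Assumed rule: threads whose estimated progress is at or above the
    median are leaders; the rest are laggards.
    """
    progress = {
        tid: estimate_progress(stalls, interval_cycles)
        for tid, stalls in stalls_by_thread.items()
    }
    cutoff = median(progress.values())
    return {
        tid: ("leader" if p >= cutoff else "laggard")
        for tid, p in progress.items()
    }


# Made-up per-thread memory stall cycles over a 1000-cycle interval.
stalls = {0: 100, 1: 400, 2: 250, 3: 700}
labels = categorize(stalls, 1000)
print(labels)
```

In a scheme of this kind, the resulting labels would serve as the hints that select a coherence policy per thread; which policy suits leaders versus laggards is left to the full paper.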
