Synchronization coherence: A transparent hardware mechanism for cache coherence and fine-grained synchronization

Yao Guo, Vladimir Vlassov, Raksit Ashok, Richard Weiss, Csaba Andras Moritz

doi:10.1016/j.jpdc.2007.08.003

Abstract

The quest for higher performance pushes designers toward finer-grained multiprocessor machines. Ever-increasing chip densities, driven by CMOS improvements, fuel research into highly parallel chip multiprocessors with hundreds of processing elements. At such levels of parallelism, synchronization is set to become a major performance bottleneck, and efficient support for synchronization an important design criterion. Previous research has shown that integrating support for fine-grained synchronization can yield significant performance benefits over traditional coarse-grained synchronization. However, little progress has been made in supporting fine-grained synchronization transparently to processor nodes, which is perhaps a key reason why wide adoption has not followed. In this paper, we propose a novel approach called synchronization coherence that provides transparent fine-grained synchronization and caching in both multiprocessor machines and single-chip multiprocessors. Our approach merges fine-grained synchronization mechanisms with traditional cache coherence protocols. It reduces network utilization as well as synchronization-related processing overheads while adding minimal hardware complexity compared to cache coherence mechanisms or previously reported fine-grained synchronization techniques. In addition to making synchronization transparent to processor nodes, for the applications studied, it provides up to 23% improvement in performance and up to 24% improvement in energy efficiency with no L2 caches, compared to previous fine-grained synchronization techniques. The performance improvement increases to as much as 38% when simulating with an ideal L2 cache system.


Journal: Journal of Parallel and Distributed Computing
Publication Date: Sep 4, 2007
Citations: 53

