An architecture for high-performance scalable shared-memory multiprocessors exploiting on-chip integration

M.E. Acacio, J. Gonzalez, J. Duato, J.M. Garcia

doi:10.1109/tpds.2004.27

Abstract

Recent technology improvements allow multiprocessor designers to integrate key components into the processor chip, such as the memory controller, the coherence hardware, and the network interface/router. In this paper, we exploit this scale of integration and present a novel node architecture aimed at reducing the long L2 miss latencies and the directory memory overhead that characterize cc-NUMA machines and limit their scalability. Our proposal replaces the traditional directory with a novel three-level directory architecture and adds a small shared data cache to each node of the multiprocessor system. Because of their small size, the first-level directory and the shared data cache are integrated into the processor chip in every node, which improves performance by avoiding accesses to the slower main memory. Scalability is guaranteed by keeping the second- and third-level directories off the processor chip and by using compressed data structures. We also present a taxonomy of L2 misses according to the actions the directory performs to satisfy them. Using execution-driven simulations, we show that the proposed node architecture yields significant latency reductions, which translate into reductions of more than 30 percent in application execution time in several cases.
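The lookup order implied by the abstract (on-chip first-level directory first, then the off-chip second and third levels) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ThreeLevelDirectory`, the dictionary-based storage, and the cycle costs are all hypothetical and chosen only to show why a first-level hit avoids the slower off-chip accesses.

```python
from dataclasses import dataclass, field

# Illustrative access costs in cycles. These numbers are assumptions made
# for this sketch; they are not taken from the paper.
ON_CHIP_COST = 1
OFF_CHIP_COST = 10


@dataclass
class ThreeLevelDirectory:
    """Toy model of a directory split into one on-chip and two off-chip levels."""

    # Level 1: small on-chip structure (hypothetical: maps line -> sharer set).
    l1: dict = field(default_factory=dict)
    # Levels 2 and 3: off-chip structures (hypothetical: same mapping here;
    # the abstract says the real ones use compressed data structures).
    l2: dict = field(default_factory=dict)
    l3: dict = field(default_factory=dict)

    def lookup(self, line):
        """Return (level_that_answered, sharers, accumulated_cost)."""
        cost = ON_CHIP_COST
        if line in self.l1:
            return 1, self.l1[line], cost
        cost += OFF_CHIP_COST
        if line in self.l2:
            return 2, self.l2[line], cost
        # The third level is treated as the backing store: it always answers,
        # returning an empty sharer set for lines it has never seen.
        cost += OFF_CHIP_COST
        return 3, self.l3.get(line, frozenset()), cost


directory = ThreeLevelDirectory()
directory.l1[0x40] = frozenset({1})
directory.l2[0x80] = frozenset({1, 2})
directory.l3[0xC0] = frozenset({3})

print(directory.lookup(0x40))  # answered on chip, lowest cost
print(directory.lookup(0xC0))  # falls through to the third level
```

In this toy model a first-level hit costs a single on-chip access, while a lookup that falls through to the third level pays for both off-chip accesses as well, which is the latency gap the proposed architecture targets.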

Journal: IEEE Transactions on Parallel and Distributed Systems
Publication Date: Aug 1, 2004
Citations: 56

Similar Papers

A novel approach to reduce L2 miss latency in shared-memory multiprocessors
M.E Acacio ... J Duato
01 Jan 2002

Reducing the latency of L2 misses in shared-memory multiprocessors through on-chip directory integration
M.E Acacio ... J.M Garcia
09 Jan 2002

Light-mesh — A pragmatic optical access network architecture for IP-centric service oriented communication
Ashwin Gumaste ... Nasir Ghani
Optical Switching and Networking | VOL. 5
02 Feb 2008

A novel node architecture for all-optical switching networks
Chi Yuan ... Yongqi He
19 Nov 2007
