Architectural support for efficient message passing on shared memory multi-cores

Rubén Titos-Gil, Oscar Palomar, Osman Unsal, Adrian Cristal

doi:10.1016/j.jpdc.2016.02.005

Abstract

Thanks to programming approaches like actor-based models, message passing is regaining popularity outside large-scale scientific computing for building scalable distributed applications on multi-core processors. Unfortunately, the mismatch between message passing models and today’s shared-memory hardware provided by commercial vendors results in suboptimal performance and wasted energy. This paper presents a set of architectural extensions that reduce the overheads incurred by message passing workloads running on shared memory multi-core architectures. It describes the instruction set extensions and their hardware implementation. To facilitate programmability, the proposed extensions are used by a message passing library, allowing programs to take advantage of them transparently. As a proof of concept, we use modified MPI libraries and unmodified MPI programs to evaluate the proposal. Experimental results show that a best-effort design can eliminate over 60% of the cache accesses caused by message data transmission and reduce the cycles spent on this task by 75%, while the addition of a simple coprocessor can completely off-load data movement from the CPU, avoiding up to 92% of cache accesses and reducing network traffic by 12% on average. The design achieves an improvement of 11%–12% in the energy-delay product of on-chip caches.
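For readers unfamiliar with the metric in the last sentence, the energy-delay product (EDP) is the energy consumed multiplied by the execution time, so a lower value is better. The sketch below shows how a relative EDP improvement could be computed. The function names and the input numbers are hypothetical illustrations and are not taken from the paper.

```python
def edp(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay product: energy multiplied by execution time (lower is better)."""
    return energy_joules * delay_seconds


def edp_improvement(base_energy: float, base_delay: float,
                    new_energy: float, new_delay: float) -> float:
    """Fractional EDP reduction of a new design relative to a baseline."""
    base = edp(base_energy, base_delay)
    return (base - edp(new_energy, new_delay)) / base


# Hypothetical, normalized values chosen only for illustration (NOT from the paper):
# the new design uses 8% less cache energy and runs 4% faster than the baseline.
improvement = edp_improvement(1.0, 1.0, 0.92, 0.96)
print(f"EDP improvement: {improvement:.1%}")
```

Because EDP multiplies the two quantities, modest savings in energy and in time compound, which is why the metric is often used to judge designs that trade one against the other.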

Journal: Journal of Parallel and Distributed Computing
Publication Date: Mar 9, 2016
Citations: 2

Similar Papers

DiMP: Architectural Support for Direct Message Passing on Shared Memory Multi-cores
Ruben Titos-Gil ... Adrian Cristal
01 Sep 2015

Optimisation Techniques for Multicore Architectures and Parallel Processing using OpenMP
Sara Tabassum Ataullah ... Mohammed Siddique
07 Dec 2021

A Massively Parallel Restriction-Smoothed Basis Multiscale Solver on Multicore and GPU Architectures
A M Manea
19 Oct 2021

Efficient and accurate Word2Vec implementations in GPU and shared-memory multicore architectures
Trevor M Simonton ... Gita Alaghband
01 Sep 2017
