Optimization of MPI collective operations on the IBM Blue Gene/Q supercomputer

Sameer Kumar, Dong Chen, Amith Mamidala, Philip Heidelberger, Daniel Faraj

doi:10.1177/1094342014552086

Abstract

The Blue Gene/Q (BG/Q) machine is the latest in the line of IBM massively parallel supercomputers, designed to scale to 262,144 nodes and 16 million threads. Each BG/Q node has 68 hardware threads. Hybrid programming paradigms, which use message passing among nodes and multi-threading within nodes, enable applications to achieve high throughput on BG/Q. In this paper, we present scalable algorithms to optimize MPI collective operations by taking advantage of the various features of the BG/Q torus and collective networks. We achieve an 8-byte double-sum MPI_Allreduce latency of 10.25 μs on 1,572,864 MPI ranks. We accelerate the summing of network packets with local buffers by using the Quad Processing SIMD unit in the BG/Q cores and by executing the sums on multiple communication threads supported by the optimized communication libraries. The net result is a peak throughput of 6.3 GB/s for double-sum allreduce. We also achieve over 90% of network peak for MPI_Alltoall with 65,536 MPI ranks.
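
For readers unfamiliar with the operation being measured, the following is a minimal sketch of what a double-sum allreduce computes: every rank contributes a buffer of doubles, and every rank ends up with the element-wise sum. It simulates the ranks in a single process and uses recursive doubling, a common textbook software algorithm. It is not the paper's method, which relies on the BG/Q torus and collective network hardware. The function name `allreduce_sum` and the power-of-two rank count are assumptions made only for this illustration.

```python
def allreduce_sum(local_values):
    """Simulate a double-sum allreduce using recursive doubling.

    local_values[r] is the list of doubles contributed by rank r.
    Returns one result buffer per rank; all buffers hold the same sums.
    Assumption for this sketch: the number of ranks is a power of two.
    """
    nranks = len(local_values)
    if nranks == 0 or nranks & (nranks - 1):
        raise ValueError("number of ranks must be a power of two")

    # Copy the inputs so the caller's buffers are left untouched.
    buffers = [list(values) for values in local_values]

    distance = 1
    while distance < nranks:
        # In each step, rank r pairs with the rank whose number differs
        # in exactly one bit; both sides add the partner's buffer.
        received = [buffers[r ^ distance] for r in range(nranks)]
        buffers = [
            [mine + theirs for mine, theirs in zip(buffers[r], received[r])]
            for r in range(nranks)
        ]
        distance *= 2
    return buffers


if __name__ == "__main__":
    # Four simulated ranks, each contributing one 8-byte double.
    result = allreduce_sum([[1.0], [2.0], [3.0], [4.0]])
    print(result[0])  # prints [10.0]
```

With P ranks the loop runs log2(P) exchange steps, which is why software allreduce latency grows with machine size and why hardware support in the network matters at the scale quoted in the abstract.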

Journal: The International Journal of High Performance Computing Applications
Publication Date: Nov 1, 2014
Citations: 40
