Abstract

All-to-all communication is a basic functionality of parallel communication libraries such as the Message Passing Interface (MPI). Typically, several underlying algorithms are available, chosen according to message size. We propose a communication algorithm that exploits the fact that modern supercomputers combine shared-memory and distributed-memory parallelism. Our application example is FFTs with pencil decomposition. Furthermore, we propose an extension of the MPI standard to accommodate this and other algorithms efficiently.
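For context, the following minimal C sketch illustrates the setting the abstract describes: a plain MPI_Alltoall exchange, plus the MPI-3 shared-memory communicator split (MPI_Comm_split_type with MPI_COMM_TYPE_SHARED) on which a hybrid, node-aware variant would typically build. It shows only standard MPI building blocks, not the algorithm proposed in the paper.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Baseline: one integer exchanged between every pair of ranks. */
    int *sendbuf = malloc(size * sizeof(int));
    int *recvbuf = malloc(size * sizeof(int));
    for (int i = 0; i < size; ++i)
        sendbuf[i] = rank * size + i;

    MPI_Alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

    /* Hybrid setting: split off a communicator of ranks that share a node.
       A node-aware all-to-all would aggregate messages within node_comm
       before exchanging between nodes; the details of that aggregation
       are what algorithms such as the one in this paper specify. */
    MPI_Comm node_comm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node_comm);

    int node_rank, node_size;
    MPI_Comm_rank(node_comm, &node_rank);
    MPI_Comm_size(node_comm, &node_size);
    printf("rank %d: local rank %d of %d on this node\n",
           rank, node_rank, node_size);

    MPI_Comm_free(&node_comm);
    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}
```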
