Abstract

In our previous work, we studied the performance of a parallel program, based on a direction splitting approach, that solves the time-dependent Stokes equation. That program uses a uniform rectangular mesh combined with a central difference scheme for the second derivatives. We targeted massively parallel computers as well as clusters of multi-core nodes, so the implementation uses hybrid parallelization based on the MPI and OpenMP standards: (i) between-node parallelism is supported by MPI-based communication, while (ii) within-node parallelism is supported by OpenMP. By matching the structure of the parallelization to the architecture of modern large-scale computers, we attempted to maximize the parallel efficiency of the program. This paper presents an experimental performance study of the developed parallel implementation on a supercomputer using Intel Xeon processors as well as Intel Xeon Phi co-processors. The experimental results show a substantial improvement across a variety of problem sizes and numbers of cores / threads.
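As an illustration of the discretization mentioned above, the following minimal sketch applies the standard second-order central difference for a second derivative on a uniform one-dimensional grid. It is not the paper's implementation: the function name `second_derivative_central`, the test function sin(pi x), and the grid size are assumptions chosen only for this example, and the sketch is serial, with no MPI or OpenMP.

```python
import math


def second_derivative_central(u, h):
    """Approximate u'' at the interior nodes of a uniform grid with spacing h.

    Uses the three-point stencil (u[i-1] - 2*u[i] + u[i+1]) / h**2,
    which is second-order accurate for smooth u.
    """
    return [
        (u[i - 1] - 2.0 * u[i] + u[i + 1]) / (h * h)
        for i in range(1, len(u) - 1)
    ]


# Hypothetical test problem: u(x) = sin(pi x) on [0, 1], so u'' = -pi^2 sin(pi x).
n = 101                      # number of grid nodes (assumed for the example)
h = 1.0 / (n - 1)            # uniform mesh spacing
x = [i * h for i in range(n)]
u = [math.sin(math.pi * xi) for xi in x]

d2u = second_derivative_central(u, h)
exact = [-(math.pi ** 2) * math.sin(math.pi * xi) for xi in x[1:-1]]

# Largest pointwise error over the interior nodes.
max_err = max(abs(a - b) for a, b in zip(d2u, exact))
print(f"max error: {max_err:.2e}")
```

In a direction splitting method, a one-dimensional stencil of this kind is applied along each coordinate direction in turn, which is what makes the per-direction work a natural unit to distribute across nodes and threads.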
