Abstract

This paper shows the development of a multi-GPU version of a time-explicit finite volume solver for the Shallow-Water Equations (SWE) on a multi-GPU architecture. MPI is combined with CUDA-Fortran in order to use as many GPUs as needed and the METIS library is leveraged to perform a domain decomposition on the 2D unstructured triangular meshes of interest. A CUDA-Aware version of OpenMPI is adopted to speed up the messages between the MPI processes. A study of both speed-up and efficiency is conducted; first, for a classic dam-break flow in a canal, and then for two real domains with complex bathymetries. In both cases, meshes with up to 12 million cells are used. Using 24 to 28 GPUs on these meshes leads to an efficiency of 80% and more. Finally, the multi-GPU version is compared to the pure MPI multi-CPU version, and it is concluded that in this particular case, about 100 CPU cores would be needed to achieve the same performance as one GPU. The developed methodology is applicable for general time-explicit Riemann solvers for conservation laws.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call