Abstract

Seismic imaging applications are computationally expensive, and industry demand keeps growing as datasets become larger and of higher quality and as higher-resolution images are required. As a result, the required computational capacity tends to grow in both floating-point throughput and memory. Many HPC clusters now have nodes with multiple GPUs (e.g., 2, 4, or 8). In this paper, we investigate mechanisms and strategies for exchanging the halo zones of the finite-difference grid of a wave simulator implemented in OpenMP. We compare the performance and programming effort of four data mapping mechanisms supported by OpenMP and CUDA. Our best strategy achieved a speedup of 3.87 on four V100 GPUs with NVLink.
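
For readers unfamiliar with halo exchange, the idea can be illustrated with a minimal host-only sketch. It is not the paper's implementation and uses neither OpenMP nor GPUs: it assumes a 1D grid, a 3-point stencil (halo width 1), and two subdomains, each standing in for the part of the grid held by one GPU. All function names and parameters are hypothetical.

```python
# Hypothetical 1D illustration of halo exchange; not the paper's code.
# Each sublist stands in for the grid portion owned by one GPU.

def step(prev, curr, c2):
    """One leapfrog step of u_tt = c^2 u_xx; first and last cells are not updated."""
    nxt = curr[:]  # copy, so boundary and halo cells keep their current values
    for i in range(1, len(curr) - 1):
        nxt[i] = 2 * curr[i] - prev[i] + c2 * (curr[i - 1] - 2 * curr[i] + curr[i + 1])
    return nxt

def exchange_halos(left, right):
    """Refresh the one-cell halos after a time step (halo width 1 assumed)."""
    left[-1] = right[1]   # left's halo <- right's first owned cell
    right[0] = left[-2]   # right's halo <- left's last owned cell

def run_single(u0, steps, c2):
    """Reference run on the undivided grid."""
    prev, curr = u0[:], u0[:]
    for _ in range(steps):
        prev, curr = curr, step(prev, curr, c2)
    return curr

def run_split(u0, steps, c2):
    """Same simulation on two subdomains that exchange halos every step."""
    m = len(u0) // 2
    lp, lc = u0[:m + 1], u0[:m + 1]   # owned cells 0..m-1 plus one halo cell
    rp, rc = u0[m - 1:], u0[m - 1:]   # one halo cell plus owned cells m..end
    for _ in range(steps):
        ln, rn = step(lp, lc, c2), step(rp, rc, c2)
        exchange_halos(ln, rn)
        lp, lc, rp, rc = lc, ln, rc, rn
    return lc[:m] + rc[1:]            # drop halos and stitch the owned cells
```

In a multi-GPU setting the two assignments in `exchange_halos` correspond to the device-to-device transfers whose mapping mechanisms the paper compares; the interior update is unchanged, which is why only the thin halo regions need to move each time step.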
