Abstract

We have developed a highly scalable 3D finite difference GPU code for use in earthquake engineering and disaster management through regional petascale earthquake simulations. This MPI-CUDA code is based on the widely used wave propagation code AWP-ODC and has been restructured for high throughput and efficiency on a heterogeneous computing architecture. We present an effective communication reduction technique for leveraging GPUs with minimal PCI-e overhead, and a novel overlapping method that fully hides data communication latency between GPUs. The optimization concept used in this work can be extended to general stencil computing on a structured grid. Benchmarks demonstrated a sustained 100 TFlops in single precision for 49 billion mesh points using 952 GPUs on the NCCS Titan Phase 5 system, a 77-fold speedup over the CPU version of the code. This multi-GPU implementation has been validated and used for a large-scale wave propagation verification simulation of the Mw 5.4 Chino Hills earthquake using 128 GPUs.
