Abstract

We present Compute Unified Device Architecture (CUDA) implementations of a well-balanced finite volume method for solving a shallow water model. The CUDA platform allows programs to run in parallel on a GPU. Four versions of the CUDA algorithm, each improving on the previous one, are presented in addition to a CPU implementation. We present the following techniques for optimizing a CUDA program: limiting register usage, changing the global memory access pattern, and loop unrolling. The accuracy of all programs is investigated in three test cases: a circular dam break on a dry bed, a circular dam break on a wet bed, and a dam break flow over three humps. The last parallel version shows a 3.84x speedup over the first CUDA implementation. We use our program to simulate a real-world problem based on an assumed partial breakage of the Srinakarin Dam, located in Kanchanaburi province, Thailand. The simulation shows that the well-balanced scheme captures the strong interaction between massive water flows and bottom elevations under wet and dry conditions, while the optimized parallel program achieves a 57.32x speedup over the serial version.

Highlights

  • Shallow water equations are derived from the conservation of mass and momentum under hydrostatic approximation

  • Serial and parallel programs for simulating shallow water flows are presented in this work

  • We develop two serial programs: a cell-based version and an edge-based version


Summary

Introduction

Shallow water equations are derived from the conservation of mass and momentum under the hydrostatic approximation. This model can describe fluid flow over time in a variety of scenarios. Because rectangular cells are simple, we implement a well-balanced scheme based on the weighted average flux (WAF) finite volume method, extended from [8,9,10,11], to simulate various complex flow problems. Several studies use unstructured grids (see [2, 3]), but a rectangular domain is simpler and makes parallel programs easier to develop.
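
For orientation, the two-dimensional shallow water equations are commonly written in the conservative form below. This is the standard textbook form, not a transcription of the paper's equations. The notation is an assumption: $h$ is the water depth, $u$ and $v$ are the depth-averaged velocities, $z$ is the bottom elevation, and $g$ is the gravitational acceleration. Friction terms are omitted here, and the full text may use different symbols or include additional source terms.

$$
\frac{\partial \mathbf{U}}{\partial t}
+ \frac{\partial \mathbf{F}(\mathbf{U})}{\partial x}
+ \frac{\partial \mathbf{G}(\mathbf{U})}{\partial y}
= \mathbf{S}(\mathbf{U}),
$$

$$
\mathbf{U} = \begin{pmatrix} h \\ hu \\ hv \end{pmatrix}, \quad
\mathbf{F} = \begin{pmatrix} hu \\ hu^{2} + \tfrac{1}{2} g h^{2} \\ huv \end{pmatrix}, \quad
\mathbf{G} = \begin{pmatrix} hv \\ huv \\ hv^{2} + \tfrac{1}{2} g h^{2} \end{pmatrix}, \quad
\mathbf{S} = \begin{pmatrix} 0 \\ -g h \, \partial z / \partial x \\ -g h \, \partial z / \partial y \end{pmatrix}.
$$

In this notation, a scheme is usually called well-balanced when it preserves the still-water steady state $u = v = 0$, $h + z = \text{const}$ exactly at the discrete level, so that the flux gradient and the bottom-slope source term cancel.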
