Abstract

The lattice Boltzmann method (LBM) has become an attractive and promising approach in computational fluid dynamics (CFD). In this paper, a parallel algorithm for the D3Q19 multi-relaxation-time LBM (MRT-LBM) with large eddy simulation (LES) is presented to simulate 3D flow past a sphere on multiple graphics processing units (GPUs). To handle complex boundaries, a method for identifying the boundary lattice nodes is devised. A 3D domain decomposition is applied to improve scalability on a cluster, and an overlapping mode is introduced to hide communication time by dividing each subdomain into two parts: an inner part and an outer part. Numerical results show good agreement with the literature, and 12 Kepler K20M GPUs achieve about 5100 million lattice updates per second, which indicates considerable scalability.
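The inner/outer split described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the one-node halo width, and the idealized timing model are assumptions made for illustration only. The sketch assumes the usual overlap schedule, in which the outer shell is updated first, the boundary exchange is started, and the inner part is updated while the exchange is in flight.

```python
def split_inner_outer(nx, ny, nz, halo=1):
    """Return (inner, outer) node counts for an nx*ny*nz subdomain.

    The outer part is the shell of thickness `halo` (assumed: 1 node)
    whose results neighbouring subdomains need; the inner part is the rest.
    """
    if min(nx, ny, nz) <= 2 * halo:
        raise ValueError("subdomain too small for this halo width")
    inner = (nx - 2 * halo) * (ny - 2 * halo) * (nz - 2 * halo)
    return inner, nx * ny * nz - inner


def is_outer(i, j, k, nx, ny, nz, halo=1):
    """True if node (i, j, k) lies in the outer shell of the subdomain."""
    return any(
        c < halo or c >= n - halo
        for c, n in ((i, nx), (j, ny), (k, nz))
    )


def step_time(t_inner, t_outer, t_comm, overlap=True):
    """Idealized time of one step (assumed model, no launch overheads).

    With overlap, communication runs concurrently with the inner update,
    so only the longer of the two contributes to the step time.
    """
    if overlap:
        return t_outer + max(t_inner, t_comm)
    return t_outer + t_inner + t_comm


if __name__ == "__main__":
    inner, outer = split_inner_outer(128, 128, 128)
    print("inner nodes:", inner, "outer nodes:", outer)
    print("no overlap:", step_time(8.0, 2.0, 3.0, overlap=False))
    print("overlap:   ", step_time(8.0, 2.0, 3.0))
```

In this model the communication cost disappears from the step time whenever the inner update takes at least as long as the exchange, which is the situation the overlapping mode aims for. For reference, the reported total of about 5100 million lattice updates per second on 12 GPUs corresponds to roughly 425 million per GPU on average.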

Highlights

  • Driven by market demand for real-time, high-definition 3D graphics over large graphics data sets, the graphics processing unit (GPU) was developed for rendering tasks

  • Wu and Shao [19] simulated lid-driven cavity flow with a parallel implementation of the MRT lattice Boltzmann method (MRT-LBM) and compared it with the single-relaxation-time LBM (SRT-LBM)

  • Because of the size of the problems treated with the LBM, a single GPU cannot handle them: high computing power and large memory space are required
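The memory pressure mentioned in the last highlight can be made concrete with a rough estimate. The sketch below is an assumption-laden illustration, not a figure from the paper: it assumes single-precision storage, two full copies of the 19 distribution functions per node (a common streaming layout), and 5 GiB of memory per GPU, and it ignores halo layers and macroscopic fields. The function names are hypothetical.

```python
Q = 19  # number of discrete velocities in the D3Q19 lattice


def d3q19_bytes(nx, ny, nz, bytes_per_value=4, copies=2):
    """Bytes needed for the distribution functions alone.

    Assumptions: 4-byte floats and two full copies of the distributions.
    """
    return nx * ny * nz * Q * bytes_per_value * copies


def min_gpus(nx, ny, nz, gpu_bytes=5 * 2**30, **kwargs):
    """Smallest GPU count whose combined memory holds the distributions.

    Assumption: 5 GiB usable memory per GPU; halos are not counted.
    """
    need = d3q19_bytes(nx, ny, nz, **kwargs)
    return -(-need // gpu_bytes)  # integer ceiling division


if __name__ == "__main__":
    for n in (128, 256, 512):
        gib = d3q19_bytes(n, n, n) / 2**30
        print(f"{n}^3 lattice: {gib:.2f} GiB, at least {min_gpus(n, n, n)} GPU(s)")
```

Under these assumptions a 512^3 lattice already needs 19 GiB for the distributions alone, which is why the domain has to be decomposed across several GPUs.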


Summary

Introduction

Driven by market demand for real-time, high-definition 3D graphics over large graphics data sets, the graphics processing unit (GPU) was developed for rendering tasks. There are several variants of the LBM, including the lattice Bhatnagar-Gross-Krook (LBGK) model [6], also known as the single-relaxation-time LBM (SRT-LBM) [7], the entropic LBM (ELBM) [8], the two-relaxation-time LBM (TRT-LBM) [9], and the multiple-relaxation-time LBM (MRT-LBM) [10, 11]. Among these methods, the MRT-LBM improves stability and gives accurate results for flows at higher Reynolds numbers [12]. This paper studies a parallel algorithm for MRT-LBM-LES on multiple GPUs. Wu and Shao [19] simulated lid-driven cavity flow with a parallel implementation of the MRT-LBM and compared it with the SRT-LBM. Tran et al. developed a high-performance parallelization of the LBM on a GPU by reducing the overhead of uncoalesced memory accesses and improving cache locality through tiling optimization combined with a data layout change [28].

MRT-LBM with LES
Multi-GPUs Architecture
MRT-LBM with LES on Multi-GPUs
Numerical Results and Discussion
Conclusion