Abstract

In this paper, two GPU parallel algorithms are proposed for the discrete unified gas kinetic scheme (DUGKS) for simulating low-speed isothermal flows. Algorithm-I applies a two-level fine-grained parallelization technique to the physical space, while Algorithm-II applies the same technique to both the physical and particle velocity spaces. To evaluate the performance of the proposed algorithms, several typical benchmark problems are simulated, including two-dimensional (2D) and three-dimensional (3D) lid-driven cavity flows as well as micro channel and cavity flows. Numerical results show that both GPU algorithms achieve satisfactory computational efficiency. For Algorithm-I, the speedup reaches 250 and 338 on a Tesla V100 GPU card for the 2D and 3D continuum cavity flows, respectively, and a hundredfold acceleration is obtained for the rarefied cases. Algorithm-II attains a speedup of about 70 for the rarefied cases, but it is not applied to continuum problems that require only a small number of velocity points. The two GPU algorithms are also compared for rarefied flows with various physical meshes and numbers of discrete velocities. The results show that Algorithm-I performs better when the physical mesh is large, while Algorithm-II is more efficient for a coarser mesh with a medium number of discrete velocities. Finally, Algorithm-I is compared with an MPI parallelization on 128 CPU cores based on the physical space discretization approach, and it is found that Algorithm-I on a V100 GPU has a clear advantage when dealing with sparse physical grids in both continuum and rarefied cases.
