Efficient same-dimensional implicit time advancement parallel scheme and optimization methods for the iteration parameters using a graphics-processing unit

Bohao Zhou, Ke Zhang, Xudong Huang, Ming Zhou, Dianfang Bi

doi:10.1063/5.0107571

Abstract

Many studies in parallel computing have focused on accelerating computational fluid dynamics (CFD) with multicore hardware such as graphics-processing units (GPUs). In GPU-accelerated CFD, parallelism is generally applied at point granularity, with each grid point treated as one parallel unit. An implicit time advancement scheme is generally more efficient than an explicit one for CFD. However, for commonly used implicit schemes, such as the lower-upper symmetric Gauss–Seidel (LUSGS) scheme, the parallel dimensionality is reduced, which makes the procedure highly time-consuming. In this paper, the data-parallel lower-upper relaxation (DPLUR) scheme, which is based on Jacobi iteration, is adopted and implemented on a GPU. Numerical experiments show that point-granularity parallelization with the DPLUR scheme, especially when implemented on a GPU, computes much faster than the reduced-dimensionality parallelization of the LUSGS scheme. Moreover, the influence of the number of Jacobi inner iteration steps (JIIS) on the convergence time is discussed, and two JIIS optimization algorithms are proposed according to the observed convergence characteristics. Based on the memory access pattern, a DPLUR red–black (DPRB) scheme is developed that converges faster and more stably than the conventional DPLUR scheme. Finally, several standard cases are used to verify the effectiveness of the DPRB scheme combined with the JIIS optimization algorithms.

Full Text