GPU acceleration of CFD solvers yields large speedups, which enable more complex and detailed simulations. However, not all numerical methods can be executed efficiently in a massively parallel fashion. The Runge–Kutta (RK) and Lower-Upper Symmetric Gauss–Seidel (LU-SGS) time-integration methods are widely used on GPUs and multicore CPUs, respectively; however, the RK method converges slowly, while the LU-SGS method is poorly suited to a many-core environment. In this paper, we propose a GPU acceleration of the matrix-free version of the Data-Parallel Lower-Upper Relaxation (DP-LUR) time-integration method in the unstructured solver FaSTAR, developed by JAXA. The DP-LUR method outperformed the RK and LU-SGS methods on an NVIDIA Tesla V100 GPU when tested on the ONERA M6 and NASA CRM geometries, and reached up to 3.83% of the peak performance of one device.