Abstract
A computational method is presented for GPU-accelerated fractional-step integration of the incompressible Navier-Stokes equations, based on the Alternating Direction Implicit (ADI) method. The non-iterative, direct solution methods used in the semi-implicit fractional-step method rely on tridiagonal systems and Fourier transforms, both of which can be computed with fast algorithms on a single GPU. However, when the data are distributed across multiple GPUs, an all-to-all matrix transposition is required, which increases the computational cost significantly. In this work, a new strategy that does not require all-to-all transposition is proposed. The computational domain is divided in the wall-normal direction, and decoupled tridiagonal systems are obtained using the Parallel Diagonal Dominant (PDD) and Parallel Partition (PPT) methods. An optimal batch size is determined to maximize the performance of the PDD and PPT methods within a given amount of GPU memory. The strengths and weaknesses of this type of domain decomposition are investigated in comparison with the conventional approach of dividing the domain along the streamwise or spanwise direction. Using 8 NVIDIA Tesla P100 GPUs, the utility of the present method is demonstrated in a direct numerical simulation (DNS) of a canonical zero-pressure-gradient turbulent boundary layer and in a DNS of K-type boundary-layer transition on 1.4 billion grid cells.
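To illustrate how a partitioned solve can decouple a tridiagonal system, the following is a minimal single-process NumPy sketch of the PDD idea. It is not the implementation described in the abstract: it runs on the CPU, the function names `thomas` and `pdd_solve` are hypothetical, the system size is assumed to be divisible by the number of partitions, and the matrix is assumed to be strongly diagonally dominant so that the coupling terms PDD drops are negligible. Each partition solves its own block with three right-hand sides, and neighbouring partitions then exchange only two interface values instead of transposing the whole array.

```python
import numpy as np

def thomas(a, b, c, rhs):
    """Solve one tridiagonal system a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = rhs[i].

    rhs has shape (m, k) for k right-hand sides; a[0] and c[-1] are ignored.
    """
    m = len(b)
    cp = np.zeros(m)
    x = np.array(rhs, dtype=float)
    cp[0] = c[0] / b[0]
    x[0] = x[0] / b[0]
    for i in range(1, m):                      # forward elimination
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        x[i] = (x[i] - a[i] * x[i - 1]) / denom
    for i in range(m - 2, -1, -1):             # back substitution
        x[i] -= cp[i] * x[i + 1]
    return x

def pdd_solve(a, b, c, d, p):
    """Approximate solve of an n x n tridiagonal system split into p partitions.

    Assumption: the matrix is diagonally dominant, so the far-end entries of
    the correction vectors v and w can be dropped (the PDD approximation).
    """
    n = len(b)
    assert n % p == 0, "sketch assumes equal-sized partitions"
    m = n // p
    xt = np.empty((p, m))   # local solutions ignoring neighbour coupling
    v = np.zeros((p, m))    # response to coupling with the previous partition
    w = np.zeros((p, m))    # response to coupling with the next partition
    for i in range(p):      # independent per partition (one GPU each, conceptually)
        s = slice(i * m, (i + 1) * m)
        rhs = np.zeros((m, 3))
        rhs[:, 0] = d[s]
        if i > 0:
            rhs[0, 1] = a[i * m]
        if i < p - 1:
            rhs[-1, 2] = c[(i + 1) * m - 1]
        xt[i], v[i], w[i] = thomas(a[s], b[s], c[s], rhs).T

    last = np.zeros(p)      # solution at the last row of partition i
    first = np.zeros(p)     # solution at the first row of partition i
    for i in range(p - 1):  # 2x2 interface system between partitions i and i+1
        det = 1.0 - w[i, -1] * v[i + 1, 0]
        last[i] = (xt[i, -1] - w[i, -1] * xt[i + 1, 0]) / det
        first[i + 1] = (xt[i + 1, 0] - v[i + 1, 0] * xt[i, -1]) / det

    x = xt.copy()
    for i in range(p):      # local correction using only neighbour interface values
        if i > 0:
            x[i] -= v[i] * last[i - 1]
        if i < p - 1:
            x[i] -= w[i] * first[i + 1]
    return x.reshape(n)

if __name__ == "__main__":
    n, p = 64, 4
    a, b, c = -np.ones(n), 4.0 * np.ones(n), -np.ones(n)
    d = np.random.default_rng(0).standard_normal(n)
    A = np.diag(b) + np.diag(a[1:], -1) + np.diag(c[:-1], 1)
    print("max error vs dense solve:", np.max(np.abs(pdd_solve(a, b, c, d, p) - np.linalg.solve(A, d))))
```

In this sketch the only data that would cross partition boundaries are one scalar from each of `xt`, `v`, and `w` per interface, which is the property that makes an all-to-all transposition unnecessary. The dropped terms decay with partition size, so very small partitions or weakly diagonally dominant systems would call for an exact partition method instead.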