We present a highly optimized, thread-safe lattice Boltzmann model in which the non-equilibrium part of the distribution function is reconstructed locally using the recursive properties of Hermite polynomials. This procedure explicitly incorporates non-equilibrium moments of the distribution up to the highest order supported by the lattice. As a result, the proposed approach increases accuracy and stability at low viscosities without sacrificing the performance or the amenability to parallelization of standard lattice Boltzmann models. The high-order thread-safe lattice Boltzmann model is tested on two turbulent flows: the turbulent channel flow at Re_τ = 180 and the axisymmetric turbulent jet at Re = 7000. It delivers results in excellent agreement with reference data [direct numerical simulations (DNS), theory, and experiments] and (a) achieves peak performance [∼5 × 10^12 floating point operations (FLOP) per second and an arithmetic intensity of ∼7 FLOP/byte on a single graphics processing unit (GPU)] by significantly reducing the memory footprint, (b) retains the algorithmic simplicity of standard lattice Boltzmann computing, and (c) permits stable simulations at vanishingly low viscosities. Our findings open attractive prospects for high-performance simulations of realistic turbulent flows on GPU-based architectures, prospects that are supported by the excellent agreement among lattice Boltzmann, experimental, and DNS reference data.
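To make the idea of rebuilding the non-equilibrium populations from Hermite moments concrete, the following minimal Python sketch performs a second-order Hermite projection on a single lattice node. It is an illustration under stated assumptions, not the implementation described above: the two-dimensional D2Q9 lattice, the function names, and the truncation at second order are all choices made here for brevity, whereas the model in the abstract uses recursion to reach the higher orders its lattice supports.

```python
# Illustrative sketch only (assumed D2Q9 lattice, hypothetical function names).
# Second-order Hermite projection of the non-equilibrium populations at one node.

# D2Q9 discrete velocities, quadrature weights, and squared lattice sound speed.
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4
CS2 = 1.0 / 3.0


def moments(f):
    """Density and velocity from the populations of one node."""
    rho = sum(f)
    ux = sum(fi * cx for fi, (cx, _) in zip(f, C)) / rho
    uy = sum(fi * cy for fi, (_, cy) in zip(f, C)) / rho
    return rho, ux, uy


def equilibrium(rho, ux, uy):
    """Standard second-order equilibrium populations."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(C, W):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1.0 + cu / CS2
                              + cu * cu / (2.0 * CS2 * CS2)
                              - usq / (2.0 * CS2)))
    return feq


def neq_stress(f):
    """Second-order non-equilibrium moment (pxx, pxy, pyy)."""
    feq = equilibrium(*moments(f))
    pxx = pxy = pyy = 0.0
    for fi, fe, (cx, cy) in zip(f, feq, C):
        d = fi - fe
        pxx += d * cx * cx
        pxy += d * cx * cy
        pyy += d * cy * cy
    return pxx, pxy, pyy


def regularized_neq(f):
    """Rebuild f_neq from its second-order Hermite coefficient alone."""
    pxx, pxy, pyy = neq_stress(f)
    out = []
    for (cx, cy), w in zip(C, W):
        # Second-order Hermite tensor components for this velocity.
        hxx = cx * cx - CS2
        hxy = cx * cy
        hyy = cy * cy - CS2
        out.append(w / (2.0 * CS2 * CS2)
                   * (hxx * pxx + 2.0 * hxy * pxy + hyy * pyy))
    return out


if __name__ == "__main__":
    # Arbitrary, slightly perturbed populations used only as a demo input.
    f = [w * (1.0 + 0.01 * i) for i, w in enumerate(W)]
    print(regularized_neq(f))
```

In this sketch the reconstructed populations carry no mass or momentum and reproduce the original non-equilibrium stress, while any higher-order content of the input is discarded. Because the reconstruction uses only moments available at the node itself, it is purely local, which is the property that keeps such schemes friendly to GPU parallelization.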