Abstract

We present a novel, hardware-agnostic implementation strategy for lattice Boltzmann (LB) simulations, which delivers high performance on homogeneous and heterogeneous many-core platforms. Based solely on C++17 Parallel Algorithms, our approach does not rely on any language extensions, external libraries, vendor-specific code annotations, or pre-compilation steps. Thanks in particular to a recently proposed GPU back-end to C++17 Parallel Algorithms, we show that a single code base can compile and reach state-of-the-art performance in both many-core CPU and GPU environments for the solution of a given nontrivial fluid dynamics problem. The proposed strategy is evaluated with six different, commonly used implementation schemes to assess the performance impact of memory access patterns on different platforms. Nine different LB collision models are included in the tests and exhibit good performance, demonstrating the versatility of our parallel approach. This work shows that it is less than ever necessary to draw a distinction between research and production software, as a concise and generic LB implementation yields performance comparable to that achievable in a hardware-specific programming language. The results also highlight the performance gains achieved by modern many-core CPUs and their apparent capability to narrow the gap with the traditionally much faster GPU platforms. All code is made available to the community in the form of the open-source project stlbm, which serves both as stand-alone simulation software and as a collection of reusable patterns for the acceleration of pre-existing LB codes.

Highlights

  • A highly challenging aspect of High Performance Computing (HPC) is the need to reformulate and restructure scientific algorithms to perform well on different types of parallel architectures

  • We propose a model for the implementation of lattice Boltzmann (LB) codes within the framework of C++ standard algorithms, and show the potential for accelerating such codes with the help of execution policies

  • A non-thermal, viscous, Newtonian, and homogeneous fluid, the dynamics of which are described by the Navier-Stokes equations, is enclosed in a cubic domain
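
The idea of expressing an LB code through C++ standard algorithms and execution policies can be illustrated with a minimal sketch. It is not taken from stlbm: the D2Q9 lattice, the BGK collision model, the array-of-structures layout, and the names Cell, collide_cell, and collide are all assumptions chosen for brevity. The sketch shows only the local collision step; streaming and boundary conditions are omitted.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <execution>
#include <utility>
#include <vector>

// Hypothetical cell type: nine D2Q9 populations stored contiguously
// (array-of-structures layout, assumed here for simplicity).
using Cell = std::array<double, 9>;

// Standard D2Q9 discrete velocities and lattice weights.
constexpr std::array<int, 9> cx = {0, 1, 0, -1, 0, 1, -1, -1, 1};
constexpr std::array<int, 9> cy = {0, 0, 1, 0, -1, 1, 1, -1, -1};
constexpr std::array<double, 9> w = {4.0 / 9,  1.0 / 9,  1.0 / 9,
                                     1.0 / 9,  1.0 / 9,  1.0 / 36,
                                     1.0 / 36, 1.0 / 36, 1.0 / 36};

// BGK collision for a single cell: relax every population toward the
// second-order equilibrium with relaxation parameter omega.
inline void collide_cell(Cell& f, double omega) {
    double rho = 0.0, jx = 0.0, jy = 0.0;
    for (std::size_t i = 0; i < 9; ++i) {
        rho += f[i];
        jx += cx[i] * f[i];
        jy += cy[i] * f[i];
    }
    const double ux = jx / rho;
    const double uy = jy / rho;
    const double usqr = ux * ux + uy * uy;
    for (std::size_t i = 0; i < 9; ++i) {
        const double cu = cx[i] * ux + cy[i] * uy;
        const double feq =
            w[i] * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usqr);
        f[i] += omega * (feq - f[i]);
    }
}

// Apply the collision to every cell. The execution policy is a template
// parameter, so the caller alone decides between serial and parallel runs.
template <class ExecutionPolicy>
void collide(ExecutionPolicy&& policy, std::vector<Cell>& lattice,
             double omega) {
    std::for_each(std::forward<ExecutionPolicy>(policy), lattice.begin(),
                  lattice.end(),
                  [omega](Cell& f) { collide_cell(f, omega); });
}
```

A call such as collide(std::execution::par_unseq, lattice, omega) requests parallel, vectorized execution, while collide(std::execution::seq, lattice, omega) runs the same code serially. Because each cell is updated independently, the loop body needs no synchronization. Whether a parallel policy actually runs in parallel, and on which device, depends on the compiler and standard library in use; with GCC's libstdc++, for instance, the parallel policies typically require linking against Intel TBB.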


Summary

Introduction

A highly challenging aspect of High Performance Computing (HPC) is the need to reformulate and restructure scientific algorithms to perform well on different types of parallel architectures. In the current hardware landscape, a special focus is devoted to many-core platforms, which include homogeneous systems like the AMD Zen processors investigated in this article, and heterogeneous systems that use a many-core device as an accelerator, such as GPUs or Intel’s discontinued Xeon Phi platform.

Lattice Boltzmann on CPUs and GPUs framework for blood flow simulations in vasculature and in medical devices”. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

