Abstract
GPU performance of the lattice Boltzmann method (LBM) depends heavily on memory access patterns. When implemented on GPUs for complex domains, geometric data is typically accessed indirectly and lattice data is accessed lexicographically. Although a variety of other options exist, no study has examined their relative efficacy. Here, we examine a suite of memory access schemes via empirical testing and performance modeling. We find strong evidence that semi-direct addressing is often better suited than the more common indirect addressing, increasing computational speed and reducing memory consumption. For the data layout, we find that the Collected Structure of Arrays (CSoA) and bundling layouts outperform the common Structure of Arrays (SoA) layout; on V100 and P100 devices, CSoA consistently outperforms bundling, whereas on K40 devices the relationship is more complicated. Compared to state-of-the-art practices, our recommendations lead to speedups of 10-40 percent and reduce memory consumption by up to 17 percent. Using performance modeling and computational experimentation, we determine the mechanisms behind these accelerations. We demonstrate that our results hold across multiple GPUs on two leadership-class systems, and present the first near-optimal strong scaling results for LBM with arterial geometries run on GPUs.
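To make the addressing and layout terminology concrete, the following minimal C sketch shows the kind of index arithmetic each choice implies for a D3Q19 lattice stored over an NX x NY x NZ bounding box. It assumes a semi-direct scheme (every bounding-box site gets a slot and neighbor locations are computed arithmetically, rather than looked up in the explicit neighbor table that indirect addressing stores for a compacted fluid-node list). All names and constants here (site, idx_soa, idx_csoa, CLUSTER) are illustrative assumptions and are not taken from the paper's implementation; in particular, the exact CSoA cluster layout benchmarked in the paper may differ.

/* Illustrative sketch (assumptions, not the paper's code): index arithmetic
 * for a D3Q19 lattice on an NX x NY x NZ bounding box under semi-direct
 * addressing, comparing SoA and a clustered (CSoA-style) layout.          */
#include <stdio.h>

#define NX 64
#define NY 64
#define NZ 64
#define Q  19        /* D3Q19: 19 discrete velocities per site             */
#define CLUSTER 32   /* cluster width for the CSoA-style layout (assumed)  */

/* Semi-direct addressing: every bounding-box site gets a slot (solid sites
 * are simply flagged), so neighbors follow from arithmetic on the
 * lexicographic index and no neighbor-index array needs to be stored.     */
static long site(long x, long y, long z) {
    return x + NX * (y + (long)NY * z);   /* lexicographic site index */
}

/* Structure of Arrays (SoA): one contiguous array per discrete velocity q. */
static long idx_soa(long s, int q) {
    return (long)q * NX * NY * NZ + s;
}

/* Collected Structure of Arrays (CSoA-style): sites are grouped into
 * clusters of CLUSTER consecutive sites, and within each cluster the Q
 * populations of those sites are stored together, which can improve
 * coalescing of GPU memory transactions.                                   */
static long idx_csoa(long s, int q) {
    long cluster = s / CLUSTER;
    long lane    = s % CLUSTER;
    return (cluster * Q + q) * CLUSTER + lane;
}

int main(void) {
    long s = site(3, 5, 7);
    for (int q = 0; q < 3; ++q)
        printf("site %ld, q=%d  ->  SoA %ld   CSoA %ld\n",
               s, q, idx_soa(s, q), idx_csoa(s, q));
    return 0;
}

Under this sketch, neighboring sites map to adjacent lanes within a cluster, so threads that handle consecutive sites read the same velocity population from consecutive addresses in both layouts; the layouts differ in how far apart the different populations of one site sit in memory, which is one mechanism the paper's performance modeling examines.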