Cache blocking for flux reconstruction: Extension to Navier-Stokes equations and anti-aliasing

Semih Akkurt, Freddie Witherden, Peter Vincent

doi:10.1016/j.cpc.2024.109332

Abstract

In this article, cache blocking is implemented for the Navier-Stokes equations, with anti-aliasing support, on mixed grids in PyFR for CPUs. Cache blocking is used as an alternative to kernel fusion to eliminate unnecessary data movement between kernels at the main memory level. Specifically, kernels that exchange data are grouped together, and these groups are then executed on small sub-regions of the domain that fit in the per-core private data cache. Cache blocking is also used to efficiently implement a tensor product factorisation of the interpolation operators associated with anti-aliasing. With cache blocking, the intermediate results between applications of the sparse factors are stored in the per-core private data cache, and a significant amount of data movement from main memory is avoided. To assess the performance gains, a theoretical model is developed, and the implementation is benchmarked using a compressible 3D Taylor-Green vortex test case on both hexahedral and prismatic grids, with third-, fourth-, and fifth-order solution polynomials. The expected speedups based on the theoretical model range from 1.99 to 2.83, and the speedups obtained in practice range from 1.51 to 3.91, compared to PyFR v1.11.0.
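The factorised, blocked interpolation described in the abstract can be illustrated with a small NumPy sketch. This is not PyFR code: the function names, the 2D tensor-product (quadrilateral) setting, and the `block_size` parameter are all assumptions made for illustration. It shows only the structure of the idea, namely that applying two 1D factors to one block of elements at a time keeps the intermediate array block-sized, and that the result matches the single dense operator. NumPy itself will not reproduce the cache behaviour of a compiled kernel.

```python
import numpy as np

def interp_dense(op1d, u):
    """Apply the full 2D operator kron(op1d, op1d) to all elements at once.

    op1d: (q, p) 1D interpolation matrix (hypothetical example operator)
    u:    (p*p, n_elements) solution values, point index = i*p + j
    """
    full = np.kron(op1d, op1d)  # dense (q*q, p*p) operator
    return full @ u

def interp_factored_blocked(op1d, u, block_size):
    """Apply the same operator as two 1D passes, one block of elements at a time."""
    q, p = op1d.shape
    n_elements = u.shape[1]
    out = np.empty((q * q, n_elements))
    for start in range(0, n_elements, block_size):
        stop = min(start + block_size, n_elements)
        blk = u[:, start:stop].reshape(p, p, -1)   # axes: (i, j, element)
        # First factor: interpolate along j. The intermediate covers only
        # this block, so in a compiled kernel it could stay in cache.
        tmp = np.einsum('bj,ije->ibe', op1d, blk)  # shape (p, q, block)
        # Second factor: interpolate along i.
        res = np.einsum('ai,ibe->abe', op1d, tmp)  # shape (q, q, block)
        out[:, start:stop] = res.reshape(q * q, -1)
    return out
```

For example, with a random `op1d` of shape `(4, 3)` and `u` of shape `(9, 10)`, `interp_factored_blocked(op1d, u, 4)` agrees with `interp_dense(op1d, u)` to floating-point tolerance, while never forming the `(16, 9)` dense operator or a full-domain intermediate.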


Journal: Computer Physics Communications
Publication Date: Aug 5, 2024
License type: cc-by

