Abstractions for C++ code optimizations in parallel high-performance applications

Jiří Klepl, Adam Šmelko, Lukáš Rozsypal, Martin Kruliš

doi:10.1016/j.parco.2024.103096

Abstract

Memory throughput is a performance bottleneck in many computational problems, especially in parallel computing. To perform efficiently, software needs to be attuned to hardware features such as cache architectures or concurrent memory banks. This can be achieved by selecting the right memory layouts for data structures or by changing the order in which data structures are traversed. In this work, we present an abstraction for traversing a set of regular data structures (e.g., multidimensional arrays) that allows the design of traversal-agnostic algorithms. Such algorithms can easily be optimized for memory performance and can employ semi-automated parallelization or autotuning without changes to their internal code. We also add an abstraction for autotuning that allows tuning parameters to be defined in one place and removes boilerplate code. The proposed solution was implemented as an extension of the Noarr library, which simplifies the layout-agnostic design of regular data structures. It is implemented entirely using C++ template metaprogramming without any nonstandard dependencies, so it is fully compatible with existing compilers, including CUDA NVCC and Intel DPC++. We evaluate the performance and expressiveness of our approach on the Polybench-C benchmarks.
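To make the idea of a traversal-agnostic algorithm concrete, the following is a minimal sketch in plain C++ templates. It is not the Noarr API: the names `RowMajor`, `ColMajor`, `RowsThenCols`, `Tiled`, and `transpose` are hypothetical and chosen only for illustration. The kernel is written once, while the memory layout and the traversal order are passed in as separate policies, which is the kind of separation the abstract describes.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical layout policies (NOT the Noarr API):
// each maps a logical index (i, j) to a linear offset in memory.
struct RowMajor {
    std::size_t rows, cols;
    std::size_t offset(std::size_t i, std::size_t j) const { return i * cols + j; }
};

struct ColMajor {
    std::size_t rows, cols;
    std::size_t offset(std::size_t i, std::size_t j) const { return j * rows + i; }
};

// Hypothetical traversal policies: each decides only the ORDER
// in which the logical indices (i, j) are visited.
struct RowsThenCols {
    template <class F>
    void for_each(std::size_t rows, std::size_t cols, F&& f) const {
        for (std::size_t i = 0; i < rows; ++i)
            for (std::size_t j = 0; j < cols; ++j)
                f(i, j);
    }
};

struct Tiled {
    std::size_t tile;  // tile edge length; assumed to be > 0

    template <class F>
    void for_each(std::size_t rows, std::size_t cols, F&& f) const {
        for (std::size_t ii = 0; ii < rows; ii += tile)
            for (std::size_t jj = 0; jj < cols; jj += tile)
                for (std::size_t i = ii; i < std::min(ii + tile, rows); ++i)
                    for (std::size_t j = jj; j < std::min(jj + tile, cols); ++j)
                        f(i, j);
    }
};

// Kernel written once: it knows neither the layouts nor the visiting order.
// It writes the transpose of `src` (shape given by `sl`) into `dst`.
template <class SrcLayout, class DstLayout, class Traversal>
void transpose(const std::vector<double>& src, SrcLayout sl,
               std::vector<double>& dst, DstLayout dl,
               const Traversal& traversal) {
    traversal.for_each(sl.rows, sl.cols, [&](std::size_t i, std::size_t j) {
        dst[dl.offset(j, i)] = src[sl.offset(i, j)];
    });
}
```

For example, `transpose(src, RowMajor{2, 3}, dst, RowMajor{3, 2}, Tiled{2})` and the same call with `RowsThenCols{}` produce identical results; only the memory access pattern differs. In this sketch, a tuning parameter such as the tile size lives in the traversal policy rather than in the kernel, so it can be changed without touching the algorithm's code.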

Journal: Parallel Computing
Publication Date: Aug 14, 2024
License type: cc-by

Similar Papers

Using randomization in the teaching of data structures and algorithms
Michael T Goodrich ... Roberto Tamassia
ACM SIGCSE Bulletin | VOL. 31
01 Mar 1999

The whiteboard environment: an electronic sketchpad for data structure design and algorithm description
D.R Brown ... B.V Zanden
01 Sep 1998

Special Issue on “Algorithm Engineering: Towards Practically Efficient Solutions to Combinatorial Problems”
Mattia D’Emidio ... Daniele Frigioni
Algorithms | VOL. 12
03 Nov 2019
