Graph-based multi-core higher-order time integration of linear autonomous partial differential equations

Dominik Huber, Martin Schreiber, Martin Schulz

doi:10.1016/j.jocs.2021.101349

Abstract

Modern high-performance computing (HPC) systems rely on increasingly complex nodes with a steadily growing number of cores and correspondingly deep memory hierarchies. To fully exploit them, algorithms must be explicitly designed around these features. In this work we address this challenge for a widely used class of application kernels: polynomial-based time integration of linear autonomous partial differential equations. We build on prior work [1] on a cache-aware yet sequential solution and provide an innovative way to parallelize it while preserving cache-awareness across a large number of cores. For this, we introduce a dependency-graph-driven view of the algorithm and then use both static graph partitioning and dynamic scheduling to efficiently map the execution to the underlying platform. We implement our approach on top of the widely available Intel Threading Building Blocks (TBB) library, although the concepts are programming-model agnostic and can apply to any task-driven parallel programming approach. We demonstrate the performance of our approach for 2nd, 4th and 6th order time integration of the linear advection equation on three different architectures with widely varying memory systems and achieve up to a 60% reduction in wall-clock time compared to a conventional, state-of-the-art non-cache-aware approach.
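
To make the kernel class concrete, here is a minimal sequential sketch of polynomial-based time integration for the linear advection equation u_t = -c u_x. The abstract does not spell out the polynomial, so the sketch assumes a truncated Taylor series of the exponential, u(t + dt) ≈ sum_{k=0}^{p} (dt L)^k / k! u(t), with L discretized by second-order central differences on a periodic domain. All function names and parameters are hypothetical, and the code reflects neither the cache-aware scheme of [1] nor the graph-based parallelization described above.

```python
import math


def apply_advection(u, c, dx):
    """Apply L u = -c * du/dx using central differences on a periodic grid."""
    n = len(u)
    # u[i - 1] wraps to u[-1] for i == 0, which gives the periodic neighbour.
    return [-c * (u[(i + 1) % n] - u[i - 1]) / (2.0 * dx) for i in range(n)]


def taylor_step(u, c, dx, dt, order):
    """Advance u by dt with the assumed polynomial sum_{k<=order} (dt L)^k / k!."""
    result = list(u)
    term = list(u)  # holds (dt L)^k / k! u, starting with k = 0
    for k in range(1, order + 1):
        lu = apply_advection(term, c, dx)
        term = [dt / k * v for v in lu]
        result = [r + t for r, t in zip(result, term)]
    return result


def advect(n=64, c=1.0, t_end=1.0, cfl=0.5, order=4):
    """Integrate a sine wave on [0, 1) and return the max error vs. the exact solution."""
    dx = 1.0 / n
    steps = math.ceil(t_end / (cfl * dx / abs(c)))
    dt = t_end / steps
    u = [math.sin(2.0 * math.pi * i * dx) for i in range(n)]
    for _ in range(steps):
        u = taylor_step(u, c, dx, dt, order)
    exact = [math.sin(2.0 * math.pi * (i * dx - c * t_end)) for i in range(n)]
    return max(abs(a - b) for a, b in zip(u, exact))


if __name__ == "__main__":
    for p in (2, 4, 6):
        print(f"order {p}: max error = {advect(order=p):.3e}")
```

Each step applies the same sparse operator `order` times in sequence, and every application streams the whole state vector through memory. That repeated operator application is the part a cache-aware, task-based formulation would restructure.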

Journal: Journal of Computational Science
Publication Date: Apr 1, 2021
Citations: 3

Similar Papers

Middleware in Modern High Performance Computing System Architectures
Christian Engelmann ... Stephen L Scott
01 Jan 2007

HPAS
Emre Ates ... Ayse K Coskun
05 Aug 2019

Comparison of Three Popular Parallel Programming Models on the Intel Xeon Phi
Ashkan Tousimojarad ... Wim Vanderbauwhede
01 Jan 2014

First Experiences in Performance Benchmarking with the New SPEChpc 2021 Suites
Holger Brunst ... Junjie Li
01 May 2022
