Abstract

We consider the solution of initial value problems (IVPs) of large systems of ordinary differential equations (ODEs) for which memory space requirements determine the choice of the integration method. In particular, we discuss the space-efficient sequential and parallel implementation of embedded Runge–Kutta (RK) methods. Our focus is on the exploitation of a special structure of commonly appearing ODE systems, referred to as "limited access distance," to improve scalability and memory usage. Such systems may arise, for example, from the semi-discretization of partial differential equations (PDEs). The storage space required by classical RK methods is directly proportional to the dimension n of the ODE system and the number of stages s of the method. We propose an implementation strategy based on a pipelined processing of the stages of the RK method and show how the memory usage of this computation scheme can be reduced to less than three storage registers by an overlapping of vectors without compromising the choice of method coefficients or the potential for efficient stepsize control. We analyze and compare the scalability of different parallel implementation strategies in detailed runtime experiments on different modern parallel architectures.
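
As a point of reference for the storage argument above, the following is a minimal sketch (not the authors' implementation) of one step of a general explicit embedded RK method in C; the names rhs_fn, rk_coeffs, and rk_step_classical are illustrative. It shows why the classical scheme needs storage proportional to s and n: all s stage vectors of length n are kept alive until the final update, which is exactly the cost the pipelined, overlapping scheme avoids.

```c
/* Minimal sketch, not the paper's code: one step of a general explicit
 * embedded RK method.  All s stage vectors of length n are stored, so the
 * working space of this classical scheme grows with s * n. */
#include <stdlib.h>

typedef void (*rhs_fn)(double t, const double *y, double *f, int n);

/* Hypothetical container for the coefficients (A, b, b_hat, c) of an
 * s-stage embedded method. */
typedef struct {
    int s;                /* number of stages */
    const double *A;      /* s x s, row-major, strictly lower triangular */
    const double *b;      /* weights of the higher-order solution */
    const double *b_hat;  /* weights of the embedded lower-order solution */
    const double *c;      /* nodes */
} rk_coeffs;

/* Advances y (length n) from t by h; err receives the embedded error
 * estimate used for stepsize control. */
void rk_step_classical(rhs_fn f, double t, double h, double *y, double *err,
                       int n, const rk_coeffs *m)
{
    int s = m->s;
    double *k = malloc((size_t)s * n * sizeof *k);  /* s stage vectors */
    double *w = malloc((size_t)n * sizeof *w);      /* current argument vector */

    for (int i = 0; i < s; i++) {
        for (int j = 0; j < n; j++) {               /* build argument of stage i */
            double acc = y[j];
            for (int l = 0; l < i; l++)
                acc += h * m->A[i * s + l] * k[(size_t)l * n + j];
            w[j] = acc;
        }
        f(t + m->c[i] * h, w, k + (size_t)i * n, n); /* stage vector k_i */
    }
    for (int j = 0; j < n; j++) {                   /* final update and error estimate */
        double dy = 0.0, dy_hat = 0.0;
        for (int i = 0; i < s; i++) {
            dy     += m->b[i]     * k[(size_t)i * n + j];
            dy_hat += m->b_hat[i] * k[(size_t)i * n + j];
        }
        err[j] = h * (dy - dy_hat);                 /* difference of the two solutions */
        y[j]  += h * dy;                            /* higher-order result */
    }
    free(k);
    free(w);
}
```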

Highlights

  • We consider the parallel solution of initial value problems (IVPs) of ordinary differential equations (ODEs) defined by y'(t) = f(t, y(t)), y(t0) = y0

  • Starting with the initial value y0, numerical solution methods for ODE IVPs walk through the integration interval [t0, te] using a potentially large number of time steps (see the time-stepping sketch after this list)

  • We have proposed an implementation strategy for RK methods which is based on a pipelined processing of the stages of the method
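
The second highlight refers to the following sketch: a hypothetical adaptive time-stepping loop that walks from t0 to te, here using a simple embedded Euler/Heun pair (orders 1 and 2) on the scalar test problem y' = -y. The tolerance and the stepsize update rule are illustrative choices, not taken from the paper.

```c
/* Illustrative adaptive time-stepping loop for the scalar IVP y' = -y,
 * y(0) = 1, using an embedded Euler/Heun pair.  A step is accepted when
 * the embedded error estimate is below the tolerance; otherwise it is
 * retried with a smaller stepsize. */
#include <math.h>
#include <stdio.h>

static double f(double t, double y) { (void)t; return -y; }

int main(void)
{
    double t = 0.0, te = 5.0, y = 1.0;
    double h = 0.1, tol = 1e-6;

    while (te - t > 1e-12) {
        if (t + h > te) h = te - t;            /* do not step past te */

        double k1 = f(t, y);
        double k2 = f(t + h, y + h * k1);
        double y_lo = y + h * k1;              /* order 1 (explicit Euler) */
        double y_hi = y + 0.5 * h * (k1 + k2); /* order 2 (Heun) */
        double err = fabs(y_hi - y_lo);        /* embedded error estimate */

        if (err <= tol) {                      /* accept the step */
            t += h;
            y  = y_hi;
        }
        /* Standard stepsize update for an order-1/2 pair, with safety
         * factor 0.9 and growth/shrink limits. */
        double scale = 0.9 * sqrt(tol / fmax(err, 1e-15));
        h *= fmin(fmax(scale, 0.2), 2.0);
    }
    printf("y(%g) = %g (exact: %g)\n", te, y, exp(-te));
    return 0;
}
```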

Summary

Introduction

Parallel implementations of embedded RK methods which make no assumptions about the method coefficients or the coupling of the ODE system have to exchange the computed parts of the current argument vector between all participating processors at every stage. The scalability of such general implementations is often not satisfactory. After a discussion of related approaches to reduce the storage space of RK methods, Section 3 discusses optimizations of the communication costs and the locality of memory references based on a pipelined processing of the stages. This pipelining approach is applicable to ODE systems with a special access pattern, as is typical for ODE systems derived by the method of lines.
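
To make the "special access pattern" concrete, the following is an illustrative right-hand-side function (not taken from the paper) for a 1-D heat equation semi-discretized by central differences, as produced by the method of lines. Component j of f accesses only y[j-1], y[j], and y[j+1], i.e., the access distance is 1; this limited coupling is what allows blocks of components to be processed stage by stage in a pipelined fashion with only neighbor-to-neighbor communication.

```c
/* Illustrative only: semi-discretized 1-D heat equation u_t = alpha * u_xx
 * with zero boundary values.  Component j of the right-hand side depends
 * only on y[j-1], y[j], y[j+1], so the access distance of f is 1. */
#include <stddef.h>

void heat_rhs(double t, const double *y, double *f, size_t n)
{
    (void)t;                                   /* autonomous system */
    const double alpha = 1.0;                  /* diffusion coefficient */
    const double dx = 1.0 / (double)(n + 1);   /* interior grid spacing */
    const double c = alpha / (dx * dx);

    for (size_t j = 0; j < n; j++) {
        double left  = (j > 0)     ? y[j - 1] : 0.0;  /* boundary value 0 */
        double right = (j + 1 < n) ? y[j + 1] : 0.0;  /* boundary value 0 */
        f[j] = c * (left - 2.0 * y[j] + right);
    }
}
```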

Related Work
Stepsize control requires additional registers
Motivation
Access Distance and Resulting Block Dependence Structure
Replacing Global by Local Communication
Reduction of the Working Space by Block-Based Pipelining
Motivation and Sequential Implementation
Parallel Implementation
Comparison of Data Set Sizes of Semi-discretized PDEs
Sequential Implementations
Compared implementation variants: D, PipeD, PipeDls
Opteron-based Systems
IBM p6 575 Cluster
SGI Altix 4700
Compared implementation variants: D, Dbc, PipeD, PipeD4, PipeD4ls
Distributed-Address-Space Implementations
Opteron Cluster
Compared implementation variants: D, Dbc, PipeD, PipeD2, PipeD4, PipeD5, PipeD4ls
Findings
Conclusions