Parallel Performance of Domain-Decomposed Preconditioned Krylov Methods for PDEs with Locally Uniform Refinement

William D Gropp, David E Keyes

doi:10.1137/0913008

Abstract

Preconditioners based on domain decomposition appear natural for the Krylov solution of implicitly discretized partial differential equations (PDEs) on parallel computers. Two-scale preconditioners (involving a global coarse-grid solve, independent solves over interfaces connecting the coarse-grid points, and independent subdomain solves) have been known since the early 1980s to be “near optimal” in the sense of ensuring a bounded, or at most logarithmically growing, iteration count as the mesh is refined. As a result, the refinement of the mesh can be chosen locally on the basis of truncation error, and the granularity of the domain decomposition can be chosen globally on the basis of parallel computing considerations with only mild effects on the convergence rate of the algorithm. However, overall computational complexity depends not only on the algebraic convergence rate, but also on the operation counts of the components of the preconditioner that must be applied at each iteration. The costs of solving the subdomain systems and the crosspoint system show superlinear growth in their respective (and inversely related) sizes. On the subdomains, the superlinear terms arise from arithmetic only; in the crosspoint system the cost of nonlocal data exchange is also superlinear. These factors make the preconditioner granularity and the choice of its components problem- and machine-dependent compromises. The tradeoffs involved are illustrated through numerical experiments on both shared- and distributed-memory computers for convection-diffusion problems. Because of the development of boundary layers, these problems benefit from local mesh refinement, which is straightforward to accommodate within the domain decomposition framework in a locally uniform sense, but which introduces load balancing as a further consideration in selecting the granularity of the preconditioner. 
In spite of the tradeoffs, cumulative speedups are obtainable out to at least medium-scale granularity (up to 64 processors in our tests). The largest problems involve $\mathcal{O}(10^5)$ unknowns partitioned into $\mathcal{O}(10^3)$ subdomains and converge in $\mathcal{O}(10)$ iterations requiring $\mathcal{O}(1)$ seconds on the Intel iPSC/860.
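
The granularity tradeoff described in the abstract can be illustrated with a toy cost model. This is only a sketch under stated assumptions, not the cost model used in the paper: the function names, the power-law form, and the exponents `alpha` and `beta` are hypothetical, chosen solely to show how superlinear subdomain and crosspoint costs with inversely related sizes produce an intermediate optimum.

```python
# Toy model (an assumption, not taken from the paper): one subdomain per
# processor, subdomain solves run concurrently, and the crosspoint system
# has a size proportional to the number of subdomains.

def iteration_cost(n_unknowns, n_subdomains, alpha=1.5, beta=1.5):
    """Hypothetical per-iteration cost of applying the preconditioner.

    alpha > 1 models superlinear growth of a subdomain solve in its size;
    beta > 1 models superlinear growth of the crosspoint solve (arithmetic
    plus nonlocal data exchange) in the number of subdomains.
    """
    subdomain_size = n_unknowns / n_subdomains
    subdomain_cost = subdomain_size ** alpha   # shrinks as granularity grows
    crosspoint_cost = n_subdomains ** beta     # grows as granularity grows
    return subdomain_cost + crosspoint_cost


def best_granularity(n_unknowns, candidates, alpha=1.5, beta=1.5):
    """Return the candidate subdomain count with the lowest modeled cost."""
    return min(
        candidates,
        key=lambda p: iteration_cost(n_unknowns, p, alpha, beta),
    )


if __name__ == "__main__":
    # Scan powers of four as candidate decompositions of a 10^4-unknown problem.
    candidates = [4 ** k for k in range(7)]
    for p in candidates:
        print(p, round(iteration_cost(10 ** 4, p), 1))
    print("modeled optimum:", best_granularity(10 ** 4, candidates))
```

In this model, too few subdomains leave the subdomain solves dominant and too many make the crosspoint solve dominant. On a real machine the exponents and constants would have to be measured, and load imbalance from local refinement would add a further term.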

Journal: SIAM Journal on Scientific and Statistical Computing
Publication Date: Jan 1, 1992
Citations: 11

Similar Papers

A Test of Moving Mesh Refinement for 2-D Scalar Hyperbolic Problems
William D Gropp
SIAM Journal on Scientific and Statistical Computing | VOL. 1
01 Jun 1980

Efficient explicit time integration for the simulation of acoustic and electromagnetic waves
01 Jan 2015

Domain Decomposition with Local Mesh Refinement
William D Gropp ... David E Keyes
SIAM Journal on Scientific and Statistical Computing | VOL. 13
01 Jul 1992

Heterogeneous Porous Media and Domain Decomposition Methods
M S Espedal ... O Saevareid
01 Jan 1990
