Abstract

We present variants of the Conjugate Gradient (CG), Conjugate Residual (CR), and Generalized Minimal Residual (GMRES) methods which are both pipelined and flexible. These allow computation of inner products and norms to be overlapped with operator and nonlinear or nondeterministic preconditioner application. The methods are hence aimed at hiding network latencies and synchronizations, which can become computational bottlenecks in Krylov methods on extreme-scale systems or in the strong-scaling limit. The new variants are not arithmetically equivalent to their base flexible Krylov methods, but are chosen to be similarly performant in a realistic use case: the application of strong nonlinear preconditioners to large problems which require many Krylov iterations. We provide scalable implementations of our methods as contributions to the PETSc package and demonstrate their effectiveness with practical examples derived from models of mantle convection and lithospheric dynamics with heterogeneous viscosity structure. These represent challenging problems where multiscale nonlinear preconditioners are required for the current state-of-the-art algorithms, and are hence amenable to acceleration with our new techniques. Large-scale tests are performed in the strong-scaling regime on a contemporary leadership supercomputer, where speedups approaching, and even exceeding, $2\times$ can be observed. We conclude by analyzing our new methods with a performance model targeted at future exascale machines.

