AmgX: A Library for GPU Accelerated Algebraic Multigrid and Preconditioned Iterative Methods

M Naumov,M Arsaev,P Castonguay,N Sakharnykh,V Sellappan,N Markovskiy,J Cohen,I Reguly,S Layton,J Demouth,J Eaton,R Strzodka

doi:10.1137/140980260

Abstract

The solution of large sparse linear systems arises in many applications, such as computational fluid dynamics and oil reservoir simulation. In realistic cases the matrices are often so large that they require large scale distributed parallel computing to obtain the solution of interest in a reasonable time. In this paper we discuss the design and implementation of the AmgX library, which provides drop-in GPU acceleration of distributed algebraic multigrid (AMG) and preconditioned iterative methods. The AmgX library implements both classical and aggregation-based AMG methods with different selector and interpolation strategies, along with a variety of smoothers and preconditioners, including block-Jacobi, Gauss--Seidel, and incomplete-LU factorization. The library contains many of the standard and flexible preconditioned Krylov subspace iterative methods, which can be combined with any of the available multigrid methods or simpler preconditioners. The parallelism in the aggregation scheme exploits parallel graph matching techniques, while the smoothers and preconditioners often rely on parallel graph coloring algorithms. The AMG algorithm implemented in the AmgX library achieves $2$--$5\times$ speedup on a single GPU against a competitive implementation on the CPU. As will be shown in the numerical experiments section, both setup and solve phases scale well across multiple nodes, sustaining this performance advantage.

Full Text