Abstract

The Flexible Generalized Minimal Residual method (FGMRES) is an attractive iterative solver for non-symmetric systems of linear equations. This paper presents several methods for parallelizing FGMRES on a variety of architectures, including multi-core CPUs, Graphics Processing Units (GPUs), and multi-GPU systems. The parallel implementations use OpenMP and CUDA kernels, and are organized according to thread scope. The linear systems employed in this study are the discrete analogues of realistic three-dimensional convection-diffusion problems, and range in size up to nearly 10^7 linear equations. All of the parallel implementations, particularly the novel hybrid approach, show a significant speedup over the sequential version. Theoretical insight and performance data are provided to allow informed decisions as to the most effective parallelization method for a given architecture.
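For readers unfamiliar with the method, the following is a minimal sequential sketch of one FGMRES cycle in plain Python. It is not the paper's implementation, which uses OpenMP and CUDA. The names `fgmres`, `damped_jacobi`, `matvec`, and `dot` are hypothetical, dense lists stand in for the sparse matrices such a solver would normally use, and the iteration-dependent damped Jacobi preconditioner is only an assumed stand-in for whatever variable preconditioner is actually used. The matrix-vector product, inner products, and vector updates marked in the comments are the operations a parallel version would typically distribute.

```python
import math

def matvec(A, x):
    # Dense matrix-vector product; a typical target for parallelization.
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    # Inner product; a reduction in a parallel setting.
    return sum(a * b for a, b in zip(u, v))

def damped_jacobi(A):
    # Hypothetical variable preconditioner: Jacobi scaling whose damping
    # changes with the iteration index j (assumption for illustration).
    diag = [A[i][i] for i in range(len(A))]
    return lambda v, j: [vi / (d * (1.0 + 0.1 * j)) for vi, d in zip(v, diag)]

def fgmres(A, b, precond, m=30, tol=1e-10):
    """One FGMRES cycle (no restart) starting from x0 = 0."""
    n = len(b)
    beta = math.sqrt(dot(b, b))
    if beta == 0.0:
        return [0.0] * n
    V = [[bi / beta for bi in b]]  # orthonormal basis vectors v_j
    Z = []                         # preconditioned vectors z_j, kept explicitly
    H = []                         # Hessenberg columns after Givens rotations
    cs, sn = [], []                # Givens cosines and sines
    g = [beta]                     # rotated right-hand side
    for j in range(m):
        z = precond(V[j], j)       # preconditioner may differ at every step
        w = matvec(A, z)
        h = []
        for v in V:                # modified Gram-Schmidt orthogonalization
            hij = dot(w, v)
            h.append(hij)
            w = [wi - hij * vi for wi, vi in zip(w, v)]
        hnext = math.sqrt(dot(w, w))
        for i in range(j):         # apply the earlier rotations to the new column
            t = cs[i] * h[i] + sn[i] * h[i + 1]
            h[i + 1] = -sn[i] * h[i] + cs[i] * h[i + 1]
            h[i] = t
        denom = math.hypot(h[j], hnext)
        if denom == 0.0:           # breakdown: keep what has been built so far
            break
        c, s = h[j] / denom, hnext / denom
        h[j] = denom               # new rotation zeroes the subdiagonal entry
        cs.append(c)
        sn.append(s)
        g.append(-s * g[j])
        g[j] = c * g[j]
        H.append(h)
        Z.append(z)
        if abs(g[j + 1]) <= tol * beta:  # |g[j+1]| is the residual norm estimate
            break
        V.append([wi / hnext for wi in w])
    k = len(H)
    y = [0.0] * k
    for i in range(k - 1, -1, -1):  # back substitution on the triangular system
        acc = g[i] - sum(H[col][i] * y[col] for col in range(i + 1, k))
        y[i] = acc / H[i][i]
    x = [0.0] * n
    for col in range(k):            # x = Z y uses z_j, not M^{-1} applied to V y
        x = [xi + y[col] * zi for xi, zi in zip(x, Z[col])]
    return x

# Small non-symmetric example system (made up for illustration).
A = [[4.0, 1.0, 0.0], [2.0, 5.0, 1.0], [0.0, 1.0, 3.0]]
b = [6.0, 15.0, 11.0]
x = fgmres(A, b, damped_jacobi(A), m=10)
```

Storing the vectors `Z` is what distinguishes the flexible variant from standard right-preconditioned GMRES: because the preconditioner can change between iterations, the solution update has to be formed from the stored `z_j` rather than by applying a single fixed preconditioner at the end.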

