Abstract

We discuss an approach for solving sparse or dense banded linear systems $Ax = b$ on a graphics processing unit (GPU) card. The matrix $A \in \mathbb{R}^{N \times N}$ is possibly nonsymmetric and moderately large, i.e., $10{,}000 \leq N \leq 500{,}000$. The split and parallelize (SaP) approach partitions the matrix $A$ into diagonal subblocks $A_i$, $i=1,\ldots,P$, which are factored independently and in parallel. The solution strategy may either account for or ignore the matrices that couple the diagonal subblocks $A_i$. This approach, together with the Krylov-subspace-based iterative method that it preconditions, is implemented in a solver called SaP::GPU, which is compared in terms of efficiency with three commonly used sparse direct solvers: PARDISO, SuperLU, and MUMPS. SaP::GPU, which runs entirely on the GPU except for several stages involved in preliminary row and column permutations, is robust and compares well in terms of efficiency with the aforementioned direct solvers. In a comparison against Intel's MKL, SaP::GPU also fared well when used to solve dense banded systems that are close to being diagonally dominant. SaP::GPU is publicly available and distributed as open source under a permissive BSD-3 license.
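To make the split-and-precondition idea concrete, the following is a minimal CPU sketch in plain Python, not the SaP::GPU implementation. It assumes the simplest banded case (a diagonally dominant tridiagonal matrix), ignores the coupling entries between the $P$ diagonal blocks, and uses BiCGStab as one possible Krylov method; the abstract does not say which Krylov method SaP::GPU uses. All function names (`matvec`, `solve_block`, `apply_preconditioner`, `bicgstab`) are hypothetical, and the loop over blocks runs sequentially here, whereas on a GPU the independent block solves would run in parallel.

```python
import math

def matvec(lo, d, up, x):
    """Return A x for a tridiagonal A stored as sub-, main and super-diagonal."""
    n = len(d)
    y = [d[i] * x[i] for i in range(n)]
    for i in range(1, n):
        y[i] += lo[i] * x[i - 1]
    for i in range(n - 1):
        y[i] += up[i] * x[i + 1]
    return y

def solve_block(lo, d, up, rhs, start, end):
    """Solve the diagonal block A[start:end, start:end] with the Thomas algorithm.
    Entries coupling this block to its neighbours are never read (assumption:
    no pivoting is needed, which holds for diagonally dominant blocks)."""
    m = end - start
    c = [0.0] * m  # scaled super-diagonal after elimination
    g = [0.0] * m  # scaled right-hand side after elimination
    c[0] = up[start] / d[start]
    g[0] = rhs[start] / d[start]
    for k in range(1, m):
        i = start + k
        denom = d[i] - lo[i] * c[k - 1]
        c[k] = up[i] / denom
        g[k] = (rhs[i] - lo[i] * g[k - 1]) / denom
    x = [0.0] * m
    x[-1] = g[-1]
    for k in range(m - 2, -1, -1):
        x[k] = g[k] - c[k] * x[k + 1]
    return x

def apply_preconditioner(lo, d, up, r, num_blocks):
    """Apply M^{-1} r, where M keeps only the num_blocks diagonal blocks of A.
    Each block solve is independent of the others."""
    n = len(d)
    bounds = [n * k // num_blocks for k in range(num_blocks + 1)]
    z = []
    for k in range(num_blocks):
        z.extend(solve_block(lo, d, up, r, bounds[k], bounds[k + 1]))
    return z

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def bicgstab(lo, d, up, b, num_blocks, tol=1e-10, max_iter=200):
    """Right-preconditioned BiCGStab, starting from x = 0."""
    n = len(b)
    x = [0.0] * n
    r = b[:]
    r_hat = r[:]
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    b_norm = math.sqrt(dot(b, b))
    for it in range(1, max_iter + 1):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [r[i] + beta * (p[i] - omega * v[i]) for i in range(n)]
        y = apply_preconditioner(lo, d, up, p, num_blocks)
        v = matvec(lo, d, up, y)
        alpha = rho_new / dot(r_hat, v)
        s = [r[i] - alpha * v[i] for i in range(n)]
        if math.sqrt(dot(s, s)) <= tol * b_norm:
            return [x[i] + alpha * y[i] for i in range(n)], it
        z = apply_preconditioner(lo, d, up, s, num_blocks)
        t = matvec(lo, d, up, z)
        omega = dot(t, s) / dot(t, t)
        x = [x[i] + alpha * y[i] + omega * z[i] for i in range(n)]
        r = [s[i] - omega * t[i] for i in range(n)]
        rho = rho_new
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
    return x, max_iter

# Small nonsymmetric, diagonally dominant test system (illustrative values).
n, P = 64, 4
lo = [0.0] + [-1.0] * (n - 1)   # lo[0] is an unused placeholder
d = [4.0] * n
up = [-2.0] * (n - 1) + [0.0]   # up[n-1] is an unused placeholder
x_true = [1.0 + 0.01 * i for i in range(n)]
b = matvec(lo, d, up, x_true)
x, iterations = bicgstab(lo, d, up, b, P)
max_error = max(abs(x[i] - x_true[i]) for i in range(n))
print(iterations, max_error)
```

With `num_blocks=1` the preconditioner is an exact solve and the iteration stops immediately; increasing the number of blocks exposes more parallelism at the cost of a weaker preconditioner, which is the trade-off behind choosing whether to account for the coupling between the subblocks.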
