We discuss an approach for solving sparse or dense banded linear systems $Ax = b$ on a graphics processing unit (GPU) card. The matrix $A \in \mathbb{R}^{N \times N}$ is possibly nonsymmetric and moderately large, i.e., $10{,}000 \leq N \leq 500{,}000$. The split and parallelize (SaP) approach partitions the matrix $A$ into diagonal subblocks $A_i$, $i=1,\ldots,P$, which are factored independently and in parallel. The solution strategy may either account for or ignore the matrices that couple the diagonal subblocks $A_i$. This approach, along with the Krylov-subspace-based iterative method that it preconditions, is implemented in a solver called SaP::GPU, which is compared in terms of efficiency with three commonly used sparse direct solvers: PARDISO, SuperLU, and MUMPS. SaP::GPU, which runs entirely on the GPU except for several stages involved in preliminary row and column permutations, is robust and compares well in terms of efficiency with these direct solvers. In a comparison against Intel's MKL, SaP::GPU also fared well when used to solve dense banded systems that are close to being diagonally dominant. SaP::GPU is publicly available and distributed as open source under a permissive BSD-3 license.
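
As a minimal sketch of the splitting described above, consider $P = 3$. The names $B_i$ and $C_i$ for the coupling blocks and $M$ for the block-diagonal part are notation assumed here for illustration; they do not appear in the text above.

$$
A =
\begin{bmatrix}
A_1 & B_1 & 0 \\
C_2 & A_2 & B_2 \\
0 & C_3 & A_3
\end{bmatrix},
\qquad
M =
\begin{bmatrix}
A_1 & 0 & 0 \\
0 & A_2 & 0 \\
0 & 0 & A_3
\end{bmatrix}.
$$

If the coupling blocks are ignored, applying $M^{-1}$ to a vector $r$ reduces to $P$ independent solves $A_i z_i = r_i$, one per diagonal subblock, which is what allows the subblocks to be factored in parallel. A Krylov-subspace method can then be applied to a preconditioned system such as $M^{-1} A x = M^{-1} b$ (left preconditioning is assumed here only as an example). The closer $A$ is to being diagonally dominant, the smaller the influence of the neglected $B_i$ and $C_i$ is likely to be.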