HLanc: Heterogeneous Parallel Implementation of the Implicitly Restarted Lanczos Method

Shuai Zhang,Xiaofan Jiao,Yulu Yang,Tao Li,Yifeng Wang

doi:10.1109/icppw.2014.60

Abstract

Graphics Processing Unit (GPU) has been used as a ubiquitous accelerator for general purpose computing, such as linear algebra routines and numerical methods. The implicitly restarted Lanczos method (IRLM) is well suited for solving the partial eigenvalue problem for large symmetric sparse matrices, which is important in many real world applications. In this paper, we present the HLanc library, a parallel implementation of IRLM on the heterogeneous CPU-GPU architecture employing the CUDA programming model. The HLanc library is designed with separated heterogeneous parallel IRLM solvers and sparse matrix-vector multiplication (SPMV) operators. The SPMV operators hide the details about the storage of sparse matrices from the IRLM solvers, so the solvers can work with any spare matrix formats. Especially the SPMV operators and IRLM solvers can be combined arbitrarily for achieving the best performance of CPU-GPU heterogeneous system. The HLanc is evaluated using eight sparse matrices with the NVIDIA GTX 480 and GTX TITAN Black GPUs. The results show that HLanc achieves 15 times speedup than the ARPACK library and scales well across different GPU generations.

Full Text