Abstract

LU factorization is extensively used in engineering and scientific computing for the solution of large sets of linear equations. In particular, circuit simulators rely heavily on sparse LU factorization to solve systems involving circuit matrices. One recent advance in this field exploits the emerging computing platform of graphics processing units (GPUs) for parallel sparse LU factorization. In this article, the following contributions are made to advance the state of the art in the hybrid right-looking algorithm (RLA): 1) a novel GPU kernel based on parallel column and block size optimization (PCBSO) is developed that adaptively allocates the block size while optimizing the number of columns executed in parallel, based on the size of their associated submatrices at every level; the proposed approach minimizes resource contention and improves computational performance; and 2) an algorithm is developed to enable execution of the new adaptive mode with dynamic parallelism. A comprehensive performance comparison using a set of benchmark circuit examples is also presented. The results indicate that the proposed advancements improve state-of-the-art right-looking sparse LU factorization on GPUs by $1.54\times$ (arithmetic mean).
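The core idea of PCBSO, as described above, is to match the GPU launch configuration to the work available at each level of the elimination tree. The following Python sketch illustrates one plausible form of such a heuristic; the function names, the warp-multiple rounding rule, and the SM-count cap are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of a level-wise adaptive launch heuristic in the
# spirit of PCBSO: pick a block size per column from its submatrix
# height, and cap the columns launched in parallel to limit contention.

def choose_block_size(n_rows, warp=32, max_block=1024):
    """Round the submatrix height up to a warp multiple, clamped to
    the hardware block-size limit (assumed CUDA-like constants)."""
    size = ((n_rows + warp - 1) // warp) * warp
    return max(warp, min(size, max_block))

def plan_level(submatrix_heights, sm_count=80):
    """For one level, pair each independent column with a block size,
    then keep only as many columns as there are SMs, largest first,
    since large submatrices dominate the level's runtime."""
    plan = [(col, choose_block_size(h))
            for col, h in enumerate(submatrix_heights)]
    plan.sort(key=lambda item: -item[1])
    return plan[:sm_count]
```

For example, a level whose independent columns have submatrix heights 10, 500, and 100 would, under these assumptions, be launched with block sizes 32, 512, and 128 respectively, with the 500-row column scheduled first.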

