An inherently parallel ℋ²-ULV factorization for solving dense linear systems on GPUs

Qianxiang Ma, Rio Yokota

doi:10.1177/10943420241242021

Abstract

Hierarchical low-rank approximation of dense matrices can reduce the complexity of their factorization from $\mathcal{O}(N^3)$ to $\mathcal{O}(N)$. However, the complex structure of such hierarchical matrices makes them difficult to parallelize. The block size and ranks can vary between the sub-blocks, which creates load imbalance. The dependency between the sub-blocks during factorization results in serialization. Since many sub-blocks are low-rank, their small computational load exposes the overhead of runtime systems. The combination of these factors makes it challenging to implement these methods on GPUs. In this work, we show that dense matrices can be factorized with linear complexity, while extracting the potential parallelism of GPUs. This is made possible through the $\mathcal{H}^2$-ULV factorization, which removes the dependency on trailing sub-matrices.

Full Text