A new formulation for LU decomposition allows efficient representation of intermediate matrices while eliminating blocks of various sizes, i.e. during “undulant-block” elimination. Its efficiency arises from its design for block encapsulization, implicit in data structures that are convenient both for process scheduling and for memory management. Row/column permutations that can destroy such encapsulizations are deferred. Its algorithms, expressed naturally as functional programs, are well suited to parallel and distributed processing. A given matrix A is decomposed into two matrices (in the space of just one), plus two permutations. The permutations, P and Q, are the row/column rearrangements usual to complete pivoting. The principal results are L and U′, where L is properly lower quasi-triangular; U′ is upper quasi-triangular with its quasi-diagonal being the inverse of that of U from the usual factorization (PAQ = (I − L)U), and its proper upper portion identical to U. The matrix result is L + U′. Algorithms for solving linear systems and matrix inversion follow directly. An example of a motivating data structure, the quadtree representation for matrices, is reviewed. Candidate pivots for Gaussian elimination under that structure are the subtrees, both constraining and assisting the pivot search, as well as decomposing to independent block/tree operations. The elementary algorithms are provided, coded in Haskell. Finally, an integer-preserving version is presented, replacing Bareiss's algorithm with a parallel equivalent. The decomposition of an integer matrix A into integer matrices L̄, Ū′, and d = det A follows the L + U′ decomposition, but the follow-on algorithm to compute dA⁻¹ is complicated by the requirement to maintain minimal denominators at every step and to avoid divisions, restricting them to necessarily exact ones.
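The quadtree representation of matrices reviewed in the abstract can be sketched as below; this is a minimal illustration in Haskell under assumed names (`QT`, `Zero`, `Leaf`, `Quad`, `add`), not the paper's actual code. A matrix is an all-zero block, a 1×1 leaf, or four quadrant subtrees, and block operations recurse independently on the quadrants:

```haskell
-- Minimal quadtree-matrix sketch (illustrative names, not the paper's code).
-- A square matrix of order 2^k is a zero block, a 1x1 leaf, or four quadrants.
data QT a = Zero                                -- all-zero block of any size
          | Leaf a                              -- 1x1 element
          | Quad (QT a) (QT a) (QT a) (QT a)    -- NW, NE, SW, SE quadrants
          deriving (Eq, Show)

-- Matrix addition decomposes into four independent quadrant additions,
-- the kind of independent block/tree operation the abstract refers to.
add :: Num a => QT a -> QT a -> QT a
add Zero y = y
add x Zero = x
add (Leaf a) (Leaf b) = Leaf (a + b)
add (Quad a b c d) (Quad e f g h) =
  Quad (add a e) (add b f) (add c g) (add d h)
add _ _ = error "mismatched block shapes"
```

The `Zero` constructor gives sparse blocks constant-size representation at any depth, which is one reason such trees are convenient for memory management; the four recursive calls in `add` are mutually independent, which is what makes them natural units for parallel scheduling.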