Abstract

We investigate the effect of load balancing when performing Cholesky factorization on a massively parallel SIMD computer. In particular we describe a supernodal algorithm for performing sparse Cholesky factorization. The way the matrix is mapped onto the processors has significant effect on its efficiency. We show that this assignment problem can be modeled as a graph coloring problem in a weighted graph. By a simple greedy algorithm, we obtain substantial speedup compared with previously suggested data mapping schemes. Experimental runs have been made on a 16K processor MasPar MP-2 parallel computer using symmetric test matrices with irregular sparsity structure. On these problems our implementation achieves performance rates of well above 200 Mflops in double precision arithmetic.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call