Abstract

A parallel algorithm for the iterative solution of sparse linear systems is presented. The algorithm is shown to be efficient for arbitrarily sparse matrices. Analysis of the algorithm suggests that a network of Processing Elements [PE's] equal in number to the number R of non-zero matrix entries is particularly useful. If this collection of PE's is interconnected by a sufficiently fast message-passing or synchronous communication network, the iteration time grows only as the logarithm of the number of PE's. A comparison with earlier work, which suggested that only √R PE's are useful for this task, is also presented. The performance of three proposed networks of PE's on this algorithm is analyzed. The networks investigated all have the topology of the Cube-Connected Cycles [CCC] graph, employ the same silicon technology, and use the same number of chips and wires; hence all should cost about the same. One, the Boolean Vector Machine [BVM], employs 2^20 bit-serial PE's implemented in 4096 VLSI chips; the other two networks use different 32-bit parallel microprocessors and a 32-bit parallel CCC to interconnect 2048 two-chip processors. One of the microprocessors is assumed to deliver about 1 Mflop per PE, while the other is assumed to deliver 32 Mflops per PE. The comparison indicates that the BVM network would outperform both of these parallel networks.
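
The abstract does not spell out the iteration itself, but the one-PE-per-non-zero idea can be illustrated with a simple Jacobi sweep: each (virtual) PE holds one non-zero entry a_ij, computes its local product a_ij * x_j, and the per-row sums are formed by a reduction that a real network would carry out in O(log R) depth. The sequential Python sketch below is only an assumption-laden illustration of that organization, not the paper's algorithm; the function name, the convergence-free fixed sweep count, and the example system are invented for demonstration.

```python
def jacobi_sparse(rows, cols, vals, b, x0, sweeps=50):
    """Jacobi iteration for A x = b with A in coordinate (COO) form.

    Entry k of (rows, cols, vals) is A[rows[k], cols[k]] = vals[k]:
    one list element per non-zero, mirroring one PE per non-zero entry R.
    """
    n = len(b)
    x = list(x0)
    # Separate out the diagonal, which Jacobi divides by.
    diag = [0.0] * n
    for i, j, a in zip(rows, cols, vals):
        if i == j:
            diag[i] = a
    for _ in range(sweeps):
        # "Parallel" step: every off-diagonal PE forms its product a_ij * x_j.
        products = [a * x[j] for i, j, a in zip(rows, cols, vals) if i != j]
        owners   = [i        for i, j, a in zip(rows, cols, vals) if i != j]
        # Reduction step: sum products by row; on the machines discussed this
        # combine would run over the communication network in logarithmic depth.
        acc = [0.0] * n
        for i, p in zip(owners, products):
            acc[i] += p
        x = [(b[i] - acc[i]) / diag[i] for i in range(n)]
    return x

# Small diagonally dominant example; the exact solution is x = (1, 1, 1).
rows = [0, 0, 1, 1, 1, 2, 2]
cols = [0, 1, 0, 1, 2, 1, 2]
vals = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
b = [3.0, 2.0, 3.0]
print(jacobi_sparse(rows, cols, vals, b, [0.0, 0.0, 0.0]))
```

On an actual PE network the two marked steps are what matter: the product step is entirely local to each PE, and only the row-wise reduction requires communication, which is why the iteration time scales with the logarithm of the number of PE's rather than with R itself.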
