Abstract

A parallel algorithm for the iterative solution of sparse linear systems is presented. The algorithm is shown to be efficient for arbitrarily sparse matrices. Analysis of the algorithm suggests that a network of Processing Elements [PE's] equal in number to the number R of non-zero matrix entries is particularly useful. If this collection of PE's is interconnected by a sufficiently fast message-passing or synchronous communication network, the iteration time grows only as the logarithm of the number of PE's. A comparison with earlier work, which suggested that only √R PE's are useful for this task, is also presented. The performance of three proposed networks of PE's on this algorithm is analyzed. The networks investigated all have the topology of the Cube-Connected Cycles [CCC] graph, employ the same silicon technology, and use the same number of chips and wires; hence all should cost about the same. One, the Boolean Vector Machine [BVM], employs 2^20 bit-serial PE's implemented in 4096 VLSI chips; the other two networks use different 32-bit parallel microprocessors and a 32-bit parallel CCC to interconnect 2048 two-chip processors. One of the microprocessors is assumed to deliver about 1 Mflop per PE, while the other is assumed to deliver 32 Mflops per PE. The comparison indicates that the BVM network would outperform both of these parallel networks.
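
The abstract does not spell out the iteration itself, but the one-PE-per-non-zero idea can be illustrated with a simple Jacobi sweep: each (virtual) PE holds one non-zero entry a_ij, computes its local product a_ij * x_j, and the per-row sums are formed by a reduction that a real network would carry out in O(log R) depth. The sequential Python sketch below is only an assumption-laden illustration of that organization, not the paper's algorithm; the function name, the convergence-free fixed sweep count, and the example system are invented for demonstration.

```python
def jacobi_sparse(rows, cols, vals, b, x0, sweeps=50):
    """Jacobi iteration for A x = b with A in coordinate (COO) form.

    Entry k of (rows, cols, vals) is A[rows[k], cols[k]] = vals[k]:
    one list element per non-zero, mirroring one PE per non-zero entry R.
    """
    n = len(b)
    x = list(x0)
    # Separate out the diagonal, which Jacobi divides by.
    diag = [0.0] * n
    for i, j, a in zip(rows, cols, vals):
        if i == j:
            diag[i] = a
    for _ in range(sweeps):
        # "Parallel" step: every off-diagonal PE forms its product a_ij * x_j.
        products = [a * x[j] for i, j, a in zip(rows, cols, vals) if i != j]
        owners   = [i        for i, j, a in zip(rows, cols, vals) if i != j]
        # Reduction step: sum products by row; on the machines discussed this
        # combine would run over the communication network in logarithmic depth.
        acc = [0.0] * n
        for i, p in zip(owners, products):
            acc[i] += p
        x = [(b[i] - acc[i]) / diag[i] for i in range(n)]
    return x

# Small diagonally dominant example; the exact solution is x = (1, 1, 1).
rows = [0, 0, 1, 1, 1, 2, 2]
cols = [0, 1, 0, 1, 2, 1, 2]
vals = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
b = [3.0, 2.0, 3.0]
print(jacobi_sparse(rows, cols, vals, b, [0.0, 0.0, 0.0]))
```

On an actual PE network the two marked steps are what matter: the product step is entirely local to each PE, and only the row-wise reduction requires communication, which is why the iteration time scales with the logarithm of the number of PE's rather than with R itself.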
