Abstract

Describes the design of ScaLAPACK, a scalable software library for performing dense and banded linear algebra computations on distributed memory concurrent computers. The specification of the data distribution has important consequences for interprocessor communication and load balance, and hence is a major factor in determining performance and scalability of the library routines. The block cyclic data distribution is adopted as a simple, yet general purpose, way of decomposing block-partitioned matrices. Distributed memory versions of the Level 3 BLAS provide an easy and convenient way of implementing the ScaLAPACK routines. >

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call