Abstract

This paper presents parallel algorithms for matrix–matrix multiplication that are built from several algorithms arranged in a multi-level structure. The upper level consists of Strassen’s algorithm, which is performed for a predefined number of recursions; this number can be adapted to the specific execution platform. The intermediate level is performed by a parallel non-hierarchical algorithm, and the lower level uses efficient one-processor implementations of matrix–matrix multiplication such as BLAS or ATLAS. Both the number of recursions of Strassen’s algorithm and the specific algorithms of the intermediate and lower levels can be chosen, so that a variety of multi-level algorithms results. Each level of a multi-level algorithm is associated with a hierarchical partition of the set of available processors into disjoint subsets, so that deeper levels of the algorithm employ smaller groups of processors in parallel. The algorithms are expressed in the multiprocessor task programming model and are coded with the runtime library Tlib. Performance experiments on several parallel platforms show that the multi-level algorithms can lead to significant performance gains.
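The level structure described above can be illustrated with a minimal sequential sketch. It is not the paper's Tlib implementation: the function name `strassen` and the parameter `depth` are hypothetical, the parallel intermediate level and the processor-group partition are omitted, and NumPy's `@` operator (which typically dispatches to an optimized BLAS) stands in for the lower level. The sketch assumes square matrices and falls back to the base multiplication when the dimension is odd.

```python
import numpy as np

def strassen(A, B, depth):
    """Multiply square matrices A and B with `depth` levels of Strassen recursion.

    Hypothetical sketch: once `depth` reaches 0 (or the size is odd), the
    product is delegated to NumPy's matmul as a stand-in for a BLAS kernel.
    """
    n = A.shape[0]
    if depth == 0 or n % 2 != 0:
        return A @ B

    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]

    # Seven half-size products instead of the eight of the standard block scheme.
    M1 = strassen(A11 + A22, B11 + B22, depth - 1)
    M2 = strassen(A21 + A22, B11, depth - 1)
    M3 = strassen(A11, B12 - B22, depth - 1)
    M4 = strassen(A22, B21 - B11, depth - 1)
    M5 = strassen(A11 + A12, B22, depth - 1)
    M6 = strassen(A21 - A11, B11 + B12, depth - 1)
    M7 = strassen(A12 - A22, B21 + B22, depth - 1)

    # Assemble the four quadrants of the result.
    C = np.empty((n, n), dtype=np.result_type(A, B))
    C[:h, :h] = M1 + M4 - M5 + M7
    C[:h, h:] = M3 + M5
    C[h:, :h] = M2 + M4
    C[h:, h:] = M1 - M2 + M3 + M6
    return C
```

Calling `strassen(A, B, 2)` on two 8×8 matrices performs two Strassen levels and then 49 base-case products on 2×2 blocks. In a parallel setting, the seven independent products at each level are the natural units to assign to disjoint processor groups, which is the role the hierarchical partition plays in the abstract.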

