Towards Structured Parallel Computing on Architecture-Independent Parallel Algorithm Design for Distributed-Memory Architectures

Feng Gao

doi:10.1006/jcss.1996.0053

Abstract

This paper introduces an architecture-independent, hierarchical approach to algorithm design on distributed-memory architectures, in contrast to the current trend of tailoring algorithms to specific architectures. We show that, rather surprisingly, this new approach can achieve uniformity without sacrificing efficiency. In our framework, there are three levels of algorithm design: design of a network-independent algorithm in a network-independent programming environment, design of virtual networks (virtual architectures) for the algorithm, and design of emulations of the virtual networks on physical networks. In its organizational principle, this methodology is analogous to the abstract data structure approach to sequential algorithm design. We propose the following thesis: architecture-independent optimality can lead to portable optimality. Namely, a single network-independent algorithm, when optimized network-independently and supported by properly chosen virtual networks, can be implemented on a wide spectrum of physical networks so as to achieve optimality on each of them with respect to both computation and communication. We illustrate this thesis by analyzing algorithm design for ordinary matrix multiplication. A general theory of portable optimality of parallel algorithms is presented in a separate paper by Gao. Besides its implications for the methodology of parallel algorithm design, our framework also suggests new questions for theoretical research in parallel computation on interconnection networks.
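
The three levels described in the abstract can be made concrete with a small sketch. It is an illustration under stated assumptions, not the construction used in the paper: the abstract names neither the algorithm nor the virtual networks, so a Cannon-style matrix multiplication on a virtual q x q torus is swapped in here, and the names torus_matmul, shift, make_emulator, hypercube_emulator and linear_array_emulator are hypothetical. The emulation level only counts hops between the physical images of virtual neighbours; it does not model congestion or scheduling and makes no claim about optimality.

```python
from typing import Callable, Dict, List, Tuple

Node = Tuple[int, int]               # coordinates on the virtual q x q torus
Matrix = List[List[int]]
Send = Callable[[Node, Node], None]  # supplied by the emulation level


def shift(vals: Dict[Node, int], q: int, di: int, dj: int, send: Send) -> Dict[Node, int]:
    """Level 2 (virtual network): every node sends its value to (i+di, j+dj) mod q."""
    out: Dict[Node, int] = {}
    for (i, j), v in vals.items():
        dst = ((i + di) % q, (j + dj) % q)
        send((i, j), dst)            # the algorithm never sees physical node ids
        out[dst] = v
    return out


def torus_matmul(A: Matrix, B: Matrix, send: Send) -> Matrix:
    """Level 1 (algorithm): Cannon-style multiply, one matrix entry per virtual node."""
    q = len(A)
    nodes = [(i, j) for i in range(q) for j in range(q)]
    # Initial alignment is treated as data placement here and is not counted as traffic.
    a = {(i, j): A[i][(i + j) % q] for (i, j) in nodes}
    b = {(i, j): B[(i + j) % q][j] for (i, j) in nodes}
    c = {n: 0 for n in nodes}
    for _ in range(q):
        for n in nodes:
            c[n] += a[n] * b[n]          # local computation
        a = shift(a, q, 0, -1, send)     # A moves one step left along each row
        b = shift(b, q, -1, 0, send)     # B moves one step up along each column
    return [[c[(i, j)] for j in range(q)] for i in range(q)]


def make_emulator(embed: Callable[[Node], int], distance: Callable[[int, int], int]):
    """Level 3 (emulation): map virtual nodes to physical ids and count hops."""
    stats = {"messages": 0, "hops": 0}

    def send(src: Node, dst: Node) -> None:
        stats["messages"] += 1
        stats["hops"] += distance(embed(src), embed(dst))

    return send, stats


def gray(x: int) -> int:
    """Reflected binary Gray code; cyclically adjacent values differ in one bit."""
    return x ^ (x >> 1)


def hypercube_emulator(q: int):
    """Assumes q is a power of two; torus neighbours land on hypercube neighbours."""
    bits = q.bit_length() - 1
    return make_emulator(
        lambda n: (gray(n[0]) << bits) | gray(n[1]),
        lambda x, y: bin(x ^ y).count("1"),   # Hamming distance
    )


def linear_array_emulator(q: int):
    """Row-major placement on a linear array; wrap-around links become long paths."""
    return make_emulator(lambda n: n[0] * q + n[1], lambda x, y: abs(x - y))


if __name__ == "__main__":
    q = 4
    A = [[i + j for j in range(q)] for i in range(q)]
    B = [[i * j + 1 for j in range(q)] for i in range(q)]
    for name, factory in [("hypercube", hypercube_emulator), ("linear array", linear_array_emulator)]:
        send, stats = factory(q)
        C = torus_matmul(A, B, send)
        print(name, stats, C[0])
```

The same torus_matmul runs unchanged under both emulators; only the embedding and the distance function differ. This is the separation of concerns the framework proposes, although the hop counts here are a crude proxy for the communication costs analysed in the paper.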
