Parallel-architecture-directed program transformation

David Sharp,Martin Cripps,John Darlington

doi:10.1016/0167-8191(92)90126-r

Abstract

The widespread use of parallel machines has been hampered by the difficulty of mapping applications onto them effectively. The difficulty arises because current programming languages require the programmer to specify a problem to be solved at a low level of abstraction in an imperative form. Thus the programmer must immediately encode an architecture-specific algorithm detailing every communication and calculation. This process is prone to error and complicates the reuse of software. An alternative approach is to specify the problem to be solved at a high-level in a functional language. Meaning-preserving program transformations can then be used to derive a parallel algorithm. Such algorithms can be run on parallel graph-reduction or dataflow machines which automatically exploit the implicit parallelism in a functional language program. Such automatic decomposition techniques, however, are not yet capable of fully yielding the extra performance offered by the parallel hardware. We show how, by including an architecture specification with the problem specification, and extending the amount of transformation performed, it is possible to produce functional language code that explicity expresses the calculations and communications to be performed by the processors. This simplifies compilation, yields faster programs and enables parallel software to be developed for a wide variety of parallel computer architectures. A goal-seeking transformation methodology has been developed which enables a high-level functional specification of the problem and a high-level functional abstraction of the target computer architecture to be systematically manipulated to produce an efficient parallel algorithm tailored to the target architecture. As the transformations start from very high-level specifications, the discovery of new algorithms is facilitated. A case study is used to demonstrate the effectiveness of the technique. We show how a high level specification for sort can be transformed with a pipeline architecture specification to give a mergesort and how the same specification with a dynamic-message-passing architecture specification can be transformed to a novel parallel quicksort.

Full Text