Abstract

An approach to the efficient execution of real-time macro-dataflow computations that exploits compile-time knowledge of computation and communication is proposed for the Intel iPSC-2 distributed-memory multiprocessor. The method is suited to synchronous dataflow computations with predictable dataflow behaviour. To reduce run-time overhead, self-timed processor assignments are generated from estimates of execution and communication times together with a simplified model of packet exchange between processors. At run-time, each processor executes an ordered list of computation and communication functions, so that only the start time of each function remains to be decided at run-time. To further reduce overhead, each processor runs a local manager program that handles the arrival and storage of remote data in an optimized manner. The execution method does not restrict the communication order and allows remote data to be received during task execution. Experimental evaluation on the iPSC-2 distributed-memory hypercube shows that self-timed execution significantly reduces run-time overhead, and that the predicted and measured speed-ups agree within 10% when the ratio of communication to computation times (granularity) is below 18% on the iPSC-2. Stochastic testing shows that coarse estimation of the computation parameters (within 30% error) leads to only moderate degradation of the potential speed-up, and that the degradation decreases with increasing problem size.
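To make the execution model concrete, the following is a minimal sketch in C of how one processor might interpret its compile-time schedule of computation and communication actions. It is not the authors' implementation: all names (sched_entry, local_manager_poll, task_a, etc.) are hypothetical, and the iPSC-2 message-passing calls are replaced by stubs. The point it illustrates is the self-timed property: there is no global synchronization, and each action starts as soon as its local predecessor finishes and, for receives, the local manager has buffered the required remote data.

```c
/* Minimal sketch (assumed, not the paper's code) of self-timed execution on
 * one processor: the compiler emits an ordered list of computation and
 * communication actions; only their start times are decided at run-time. */
#include <stdio.h>

typedef enum { T_COMPUTE, T_SEND, T_RECV } action_kind;

typedef struct {
    action_kind kind;
    int         id;              /* task id or message tag (illustrative) */
    void      (*compute)(void);  /* used only for T_COMPUTE entries */
} sched_entry;

/* ---- stand-in for the local manager that buffers remote data ---- */
#define MAX_MSGS 64
static int msg_ready[MAX_MSGS];  /* set when a remote data block has arrived */

static void local_manager_poll(void)
{
    /* In the real system this would drain the network interface and store
     * incoming blocks; here we simply mark every block as available. */
    for (int i = 0; i < MAX_MSGS; ++i) msg_ready[i] = 1;
}

static void send_block(int tag)    { printf("send   block %d\n", tag); }
static int  block_arrived(int tag) { local_manager_poll(); return msg_ready[tag]; }

/* ---- placeholder computation functions ---- */
static void task_a(void) { printf("compute task A\n"); }
static void task_b(void) { printf("compute task B\n"); }

/* Self-timed interpreter: walk the compile-time list in order; a receive
 * blocks only until the local manager reports that its data is present. */
static void run_schedule(const sched_entry *list, int n)
{
    for (int i = 0; i < n; ++i) {
        switch (list[i].kind) {
        case T_COMPUTE:
            list[i].compute();
            break;
        case T_SEND:
            send_block(list[i].id);
            break;
        case T_RECV:
            while (!block_arrived(list[i].id))
                ;  /* wait for the local manager to deliver the block */
            printf("recv   block %d\n", list[i].id);
            break;
        }
    }
}

int main(void)
{
    /* Ordered list produced at compile time for this processor. */
    sched_entry schedule[] = {
        { T_COMPUTE, 0, task_a },
        { T_SEND,    3, NULL   },
        { T_RECV,    7, NULL   },
        { T_COMPUTE, 1, task_b },
    };
    run_schedule(schedule, (int)(sizeof schedule / sizeof schedule[0]));
    return 0;
}
```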
