Abstract

Supernode transformation has been proposed to reduce communication startup cost by grouping a number of loop iterations into a supernode, which is assigned to a processor as a single unit. A supernode transformation is specified by three factors: n families of hyperplanes that slice the iteration space into parallelepiped supernodes, the grain size of a supernode, and the relative side lengths of the parallelepiped supernode. All three factors affect the total running time. This paper addresses how to find an optimal grain size and an optimal relative side length vector, with the goal of minimizing total running time. Two communication cost models are considered. In the first (one-parameter) model, communication cost is approximated by a constant startup penalty; in the second (two-parameter) model, communication cost is a function of the startup penalty and the message size. We derive closed-form analytical expressions for the optimal supernode size for the one-parameter model and the two-parameter model with doubly nested loops. A closed-form expression for the optimal relative side length vector is also provided for the one-parameter model with constant-bounded loop iteration space.
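
The two communication cost models can be illustrated with a minimal sketch. The abstract says only that the second model is a function of the startup penalty and the message size; the linear form used here, and every function and parameter name, are assumptions made for illustration and are not taken from the paper.

```python
def startup_only_cost(num_messages: int, startup: float) -> float:
    """One-parameter model: each message costs a constant startup penalty."""
    return num_messages * startup


def startup_plus_size_cost(
    num_messages: int,
    startup: float,
    per_word: float,
    words_per_message: float,
) -> float:
    """Two-parameter model, ASSUMED linear in message size for illustration.

    Each message costs the startup penalty plus a per-word transfer cost.
    """
    return num_messages * (startup + per_word * words_per_message)


if __name__ == "__main__":
    # Hypothetical numbers: 4 messages, startup penalty of 10 time units.
    print(startup_only_cost(4, 10.0))
    # Same messages, each carrying 20 words at 0.5 time units per word.
    print(startup_plus_size_cost(4, 10.0, 0.5, 20.0))
```

Under either model, grouping iterations into larger supernodes means fewer messages, so the startup penalty is paid less often. In the second model the per-message size term grows with the supernode, which is one reason the grain size needs to be optimized rather than simply maximized.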
