Symbolic parallelization of loop programs for massively parallel processor arrays

Jurgen Teich, Alexandru Tanase, Frank Hannig

doi:10.1109/asap.2013.6567543

Abstract

In this paper, we present a first solution to the previously unsolved problem of jointly tiling and scheduling a given loop nest with uniform data dependencies symbolically. This problem arises for loop programs whose iterations shall be optimally scheduled on a processor array whose size is unknown at compile time. Still, we show that it is possible to derive parameterized latency-optimal schedules statically by proposing two new program transformations: In the first step, the iteration space is tiled symbolically into orthotopes of parameterized extensions. The resulting tiled program is subsequently scheduled symbolically. Here, we show that the maximal number of potential optimal schedules is upper bounded by 2^n n!, where n is the dimension of the loop nest. However, the actual number of optimal schedule candidates is much smaller than this. At run time, once the size of the processor array becomes known, simple comparisons of latency-determining expressions determine which of these schedules is dynamically activated and which corresponding program configuration is executed on the resulting processor array, so as to avoid any further run-time optimization or expensive recompilation.
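
As an illustration of the two quantities named in the abstract, the following minimal Python sketch tiles a rectangular iteration space into orthotopes whose extensions are parameters, and evaluates the bound 2^n n! for a given loop-nest dimension n. It is not the authors' transformation: the function names `tiled_iterations` and `schedule_candidate_bound`, the rectangular iteration space, and the numeric stand-ins for the symbolic tile sizes are all assumptions made for this example.

```python
from itertools import product
from math import factorial


def tiled_iterations(extents, tile_sizes):
    """Enumerate a rectangular iteration space tile by tile.

    extents    -- loop bounds (N_1, ..., N_n) of the original loop nest
    tile_sizes -- tile extensions (p_1, ..., p_n); numeric stand-ins here
                  for the parameters that stay symbolic in the paper
    Yields (tile_index, intra_tile_offset, original_point).
    Hypothetical helper, not taken from the paper.
    """
    # Number of tiles per dimension, using ceiling division.
    tile_counts = [-(-n // p) for n, p in zip(extents, tile_sizes)]
    for tile in product(*(range(c) for c in tile_counts)):
        for offset in product(*(range(p) for p in tile_sizes)):
            point = tuple(t * p + o for t, p, o in zip(tile, tile_sizes, offset))
            # Boundary tiles may stick out of the iteration space; skip padding.
            if all(x < n for x, n in zip(point, extents)):
                yield tile, offset, point


def schedule_candidate_bound(n):
    """Upper bound 2^n * n! on potential optimal schedules quoted in the abstract."""
    return 2 ** n * factorial(n)


if __name__ == "__main__":
    extents, tile_sizes = (5, 7), (2, 3)
    visited = [point for _, _, point in tiled_iterations(extents, tile_sizes)]
    # Tiling only reorders the iterations: every point is visited exactly once.
    assert sorted(visited) == sorted(product(range(5), range(7)))
    for n in (1, 2, 3):
        print(n, schedule_candidate_bound(n))
```

For a 2-dimensional loop nest the bound evaluates to 8 candidates and for a 3-dimensional one to 48, which gives a sense of why a handful of run-time comparisons can replace recompilation once the array size is known.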

Full Text