Abstract

In this paper, we present a first solution to the unsolved problem of joint tiling and scheduling a given loop nest with uniform data dependencies symbolically. This problem arises for loop programs for which the iterations shall be optimally scheduled on a processor array of unknown size at compile-time. Still, we show that it is possible to derive parameterized latencyoptimal schedules statically by proposing two new program transformations: In the first step, the iteration space is tiled symbolically into orthotopes of parametrized extensions. The resulting tiled program is subsequently scheduled symbolically. Here, we show that the maximal number of potential optimal schedules is upper bounded by 2nn! where n is the dimension of the loop nest. However, the real number of optimal schedule candidates being much less than this. At run-time, once the size of the processor array becomes known, simple comparisons of latency-determining expressions finally steer which of these schedules will be dynamically activated and the corresponding program configuration executed on the resulting processor array so to avoid any further run-time optimization or expensive recompilations.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.