Abstract

Some loops with cross-iteration dependences can execute in parallel by pipelining. The loop body is partitioned into stages such that the data dependences are not violated, and the stages are then mapped onto threads. Two well-known mapping techniques are fixed code and fixed data; they achieve high performance for load-balanced loops but perform poorly for load-imbalanced loops. In this article, we present a novel hybrid mapping that eliminates the drawbacks of both prior mapping techniques and enables dynamic scheduling of stages.
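
As an illustration of pipelining a loop with a cross-iteration dependence, the following minimal sketch splits a loop body into two stages and runs each stage on its own thread. It is not the hybrid mapping presented in the article. The one-thread-per-stage layout is assumed here to resemble a fixed-code-style mapping (the abstract does not define the term), and the names `run_pipeline`, `stage1`, and `stage2` are hypothetical.

```python
import queue
import threading


def run_pipeline(items):
    """Run a two-stage loop pipeline with one thread per stage.

    Assumption: this one-thread-per-stage layout only illustrates the
    general idea of mapping stages onto threads; it is not the article's
    hybrid mapping.
    """
    channel = queue.Queue()   # carries stage 1 output to stage 2
    results = []
    done = object()           # sentinel marking the end of the loop

    def stage1():
        total = 0
        for x in items:
            total += x        # cross-iteration dependence: needs the previous total
            channel.put(total)
        channel.put(done)

    def stage2():
        while True:
            value = channel.get()
            if value is done:
                break
            results.append(value * value)  # independent of other iterations

    threads = [threading.Thread(target=stage1), threading.Thread(target=stage2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

While stage 2 processes iteration i, stage 1 can already work on iteration i + 1, which is where the parallelism comes from. If one stage is much slower than the other, the faster thread mostly waits, which is the kind of load imbalance the abstract refers to.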
