Abstract

Compilers for superscalar and VLIW processors must expose sufficient instruction-level parallelism to achieve high performance. Compile-time code transformations that expose instruction-level parallelism typically take into account the constraints imposed by all execution scenarios in the program. However, there are additional opportunities to increase instruction-level parallelism along the frequent execution sequences at the expense of the less frequent ones. Profile information identifies these important execution sequences in a program. In this paper, two major categories of profile information are studied: control-flow and memory-dependence. Profile-based transformations have been incorporated into the IMPACT compiler. These transformations include global optimization, acyclic global scheduling, and software pipelining. The effectiveness of these profile-based techniques is evaluated for a range of superscalar and VLIW processors.

