Using profile information to assist advanced compiler optimization and scheduling

W Chen, J Gyllenhaal, R Bringmann, R Hank, N Warter, T Kiyohara, S Anik, D Lavery, W-M Hwu, S Mahlke

doi:10.1007/3-540-57502-2_38

Abstract

Compilers for superscalar and VLIW processors must expose sufficient instruction-level parallelism in order to achieve high performance. Compile-time code transformations which expose instruction-level parallelism typically take into account the constraints imposed by all execution scenarios in the program. However, there are additional opportunities to increase instruction-level parallelism along the frequent execution scenario at the expense of the less frequent execution sequences. Profile information identifies these important execution sequences in a program. In this paper, two major categories of profile information are studied: control-flow and memory-dependence. Profile-based transformations have been incorporated into the IMPACT compiler. These transformations include global optimization, acyclic global scheduling, and software pipelining. The effectiveness of these profile-based techniques is evaluated for a range of superscalar and VLIW processors.

Full Text