Abstract

Tiling is a well-known loop transformation to improve temporal locality of nested loops. Current compiler algorithms for tiling are limited to loops which are perfectly nested or can be transformed, in trivial ways, into a perfect nest. This paper presents a number of program transformations to enable tiling for a class of nontrivial imperfectly-nested loops such that cache locality is improved. We define a program model for such loops and develop compiler algorithms for their tiling. We propose to adopt odd-even variable duplication to break anti- and output dependences without unduly increasing the working-set size, and to adopt speculative execution to enable tiling of loops which may terminate prematurely due to, e.g. convergence tests in iterative algorithms. We have implemented these techniques in a research compiler, Panorama. Initial experiments with several benchmark programs are performed on SGI workstations based on MIPS R5K and R10K processors. Overall, the transformed programs run faster by 9% to 164%.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call