Abstract

The performance gap for high-performance applications has been widening over time. High-level program transformations are critical to improving application performance, and many of them require determining optimal values for transformation parameters, such as loop unrolling factors and blocking sizes. Static approaches derive these values from analytical models, which are hard to build because of increasingly complex architectures and code structures. Recent iterative compilation approaches instead execute different versions of the program on the actual platform and select the one that delivers the best performance, significantly outperforming static compilation approaches. However, their high compilation cost has limited their application to embedded applications and a small group of math kernels. This paper proposes a combined approach, Combining Model and Iterative Compilation for Program Performance Optimization (CMIC). The approach first constructs a program optimization transformation model based on hardware performance counters to decide how and when to apply transformations, and then selects the optimal transformation parameters using the Nelder-Mead simplex algorithm. Experimental results show that our approach effectively improves programs’ floating-point performance and reduces their runtime, thereby narrowing the performance gap for high-performance applications.
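
The parameter-selection stage can be illustrated with a minimal sketch. It is not the paper's implementation: `synthetic_runtime` is a hypothetical stand-in for compiling and timing a program variant, its optimum at an unroll factor of 8 and a block size of 64 is chosen arbitrarily, and `nelder_mead` is a simplified textbook variant of the simplex method (reflection, expansion, inside contraction, shrink).

```python
def nelder_mead(f, x0, step=4.0, max_iter=500, tol=1e-10):
    """Minimize f with a simplified Nelder-Mead simplex search."""
    n = len(x0)
    # Initial simplex: x0 plus one vertex displaced along each axis.
    simplex = [list(x0)]
    for i in range(n):
        p = list(x0)
        p[i] += step
        simplex.append(p)
    values = [f(p) for p in simplex]

    for _ in range(max_iter):
        # Order vertices from best (lowest cost) to worst.
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        if abs(values[-1] - values[0]) < tol:
            break

        worst = simplex[-1]
        # Centroid of every vertex except the worst one.
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]

        # Reflect the worst vertex through the centroid.
        refl = [c + (c - w) for c, w in zip(centroid, worst)]
        f_refl = f(refl)

        if values[0] <= f_refl < values[-2]:
            simplex[-1], values[-1] = refl, f_refl
        elif f_refl < values[0]:
            # Reflection gave a new best point: try expanding further.
            exp = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            f_exp = f(exp)
            if f_exp < f_refl:
                simplex[-1], values[-1] = exp, f_exp
            else:
                simplex[-1], values[-1] = refl, f_refl
        else:
            # Contract the worst vertex toward the centroid.
            con = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            f_con = f(con)
            if f_con < values[-1]:
                simplex[-1], values[-1] = con, f_con
            else:
                # Shrink every vertex toward the best one.
                best = simplex[0]
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, pt)]
                    for pt in simplex[1:]
                ]
                values = [values[0]] + [f(p) for p in simplex[1:]]

    best_i = min(range(n + 1), key=lambda i: values[i])
    return simplex[best_i], values[best_i]


def synthetic_runtime(params):
    """Hypothetical cost model; a real tuner would compile and time the program."""
    unroll, block = params
    return 1.0 + 0.01 * (unroll - 8.0) ** 2 + 0.0005 * (block - 64.0) ** 2


# Start from a small unroll factor and block size, then search.
best_params, best_cost = nelder_mead(synthetic_runtime, [2.0, 16.0])
# Transformation parameters are integers, so round the continuous result.
unroll_factor, block_size = (round(v) for v in best_params)
print(unroll_factor, block_size, best_cost)
```

In a real iterative-compilation setting each call to the cost function is a compile-and-run cycle, so the appeal of a simplex search is that it needs only function values (no gradients) and typically far fewer evaluations than an exhaustive sweep of the parameter space.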
