A Framework for Intelligent Speculative Compiler Optimizations and its Application to Memory Accesses

Lars Alvincz

doi:10.14279/depositonce-2255

Abstract

In this thesis, we present a conceptual Framework for Intelligent Speculative Compiler Optimizations (FrISCO) and its application to the optimization of memory accesses. The framework aims to provide compilers with knowledge about the run-time behavior of programs, bridging the gap between static program analysis and dynamic program behavior. This solves the problem of over-approximation, which is inherent in static program analyses, and increases the optimization potential. We use machine learning to make this knowledge available to the compiler. The principal idea of our framework is to admit unsafe, yet more precise, program analyses within the compiler and to use their results in speculative optimizations. These optimizations use the information to derive precise cost models and guarantee program correctness in case of misspeculation. In our approach, we use heuristics to predict the dynamic program behavior. We present a method to generate such heuristics automatically from profiling data, using machine learning in a one-off training phase. Additionally, we propose program classification to group programs with similar behavior, which can be done automatically via cluster analysis. Based on the clustering, we train one specialized heuristic for each class as well as a program class predictor. With that, we can precisely predict the behavior of arbitrary programs by selecting the most appropriate heuristic. The resulting overall heuristic is highly scalable and can be automatically translated into executable code for integration into compiler optimizations. We present a general optimization algorithm onto which most existing optimizations can be mapped. The algorithm transforms the program iteratively. To ensure that the best transformation is found in each step, the algorithm uses a cost model that is evaluated with the help of the heuristics.
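
As a rough illustration of the two ideas in the paragraph above (select a class-specific heuristic through a program class predictor, then let a cost model drive an iterative transformation loop), the following minimal Python sketch may help. It is not taken from the thesis: the nearest-centroid predictor, the function names `nearest_class`, `predict`, and `optimize`, and the toy representation of a program as a number of exposed stall cycles are all assumptions made for this example.

```python
from typing import Callable, List, Sequence

Features = Sequence[float]


def nearest_class(features: Features, centroids: List[Features]) -> int:
    """Hypothetical program class predictor: index of the closest cluster centroid."""
    def dist2(a: Features, b: Features) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centroids)), key=lambda i: dist2(features, centroids[i]))


def predict(features: Features,
            centroids: List[Features],
            class_heuristics: List[Callable[[Features], float]]) -> float:
    """Pick the heuristic trained for the predicted class and evaluate it."""
    return class_heuristics[nearest_class(features, centroids)](features)


def optimize(program, transformations, cost):
    """Greedy loop: apply the cheapest candidate transformation per step
    and stop as soon as no candidate lowers the cost."""
    current, current_cost = program, cost(program)
    while True:
        candidates = [t(current) for t in transformations]
        if not candidates:
            return current
        best = min(candidates, key=cost)
        if cost(best) >= current_cost:
            return current
        current, current_cost = best, cost(best)


# Toy data (assumed): two program classes with constant per-class predictions.
centroids = [(0.0, 0.0), (10.0, 10.0)]
class_heuristics = [lambda f: 0.1, lambda f: 0.9]
miss_probability = predict((9.0, 8.0), centroids, class_heuristics)

# Toy "program": number of exposed stall cycles; transformations hide some of them.
hoist_far = lambda p: p - 3 if p >= 3 else p
hoist_near = lambda p: p - 1 if p >= 1 else p
result = optimize(10, [hoist_far, hoist_near], lambda p: p * miss_probability)
```

In this sketch the heuristic only scales the cost; a real cost model would be considerably richer, but the division of labor (predictor selects the heuristic, heuristic feeds the cost model, cost model steers the loop) follows the description above.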
The conceptual framework is applicable to a wide range of program behavior and program optimizations. In the second part of this thesis, we show the application of the framework to the optimization of memory accesses, which is a highly important optimization problem due to the memory gap. For the applied framework, we present a novel optimization algorithm that performs speculative code motion to reduce the effective latency of load instructions. During code motion, the algorithm overcomes memory dependencies, register dependencies, and control dependencies, and it maintains a precise cost model that captures the effect of each transformation on the latency of the optimized load. The cost model relies on information about the memory behavior of a program, namely the probability of memory dependencies and the load latencies. We show how to build heuristics for both via machine learning. We fully implemented the instantiated framework. As the target architecture, we chose the Intel Itanium 2 processor, a modern VLIW processor with hardware support for speculation. Our experiments show, first, that the heuristics predict the memory behavior precisely, in particular thanks to our concept of program classification. Second, our run-time experiments demonstrate that our speculative optimization, with the help of the heuristics, significantly improves program performance and, thanks to the cost model, avoids performance degradation.
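
To make the role of the two predicted quantities more concrete, here is a small Python sketch of how a dependence probability and a load latency could enter a cost decision for hoisting a load speculatively above a store. The formula, the function names `expected_latency` and `should_hoist`, and the numbers are illustrative assumptions, not the cost model of the thesis.

```python
def expected_latency(load_latency: float, hidden_cycles: float,
                     p_dependence: float, recovery_penalty: float) -> float:
    """Assumed model: exposed load latency plus the expected cost of recovery.

    load_latency     -- predicted latency of the load in cycles
    hidden_cycles    -- cycles of independent work between the load and its first use
    p_dependence     -- predicted probability that the load conflicts with a store it crosses
    recovery_penalty -- cycles lost when the speculation fails and recovery code runs
    """
    exposed = max(0.0, load_latency - hidden_cycles)
    return exposed + p_dependence * recovery_penalty


def should_hoist(load_latency: float, hidden_before: float, hidden_after: float,
                 p_dependence: float, recovery_penalty: float) -> bool:
    """Hoist only if the expected latency after the move is lower than before it."""
    before = expected_latency(load_latency, hidden_before, 0.0, recovery_penalty)
    after = expected_latency(load_latency, hidden_after, p_dependence, recovery_penalty)
    return after < before


# Rarely conflicting load: hoisting hides 6 more cycles at a small expected penalty.
rare_conflict = should_hoist(12, 4, 10, 0.05, 20)
# Frequently conflicting load: the expected recovery cost outweighs the hidden cycles.
frequent_conflict = should_hoist(12, 4, 10, 0.5, 20)
```

Under these assumptions, the first call favors the move and the second rejects it, which is the kind of decision a cost model needs to make to avoid slowdowns from misspeculation.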
