Compiler optimization is crucial to program performance: it improves execution speed, reduces memory usage, and minimizes energy consumption. Nevertheless, modern compilers such as LLVM, with their numerous optimization passes, present a significant challenge in identifying the most effective sequence for optimizing a program. This study addresses the problem of determining optimal compiler optimization sequences within the LLVM framework, which encompasses 64 optimization passes, resulting in an immense search space of 2^64 possible sequences. Identifying the ideal sequence for even simple code can be an arduous task, as the interactions between passes are intricate and unpredictable. The primary objective of this research is to use machine-learning techniques to predict effective optimization sequences that outperform the default -O2 and -O3 optimization flags. The methodology involves generating 2,000 random sequences per program and selecting the one that achieves the shortest execution time. Three machine-learning models—K-Nearest Neighbor (KNN), Decision Tree (DT), and Feedforward Neural Network (FFNN)—were employed to predict optimization sequences based on features extracted from programs during execution. The study used benchmarks from the Polybench, Shootout, and Stanford suites, each with varying problem sizes, to validate the proposed technique. The results demonstrate that the KNN model produced optimization sequences with superior performance compared to DT and FFNN. On average, KNN achieved execution times 2.5 times faster than those obtained with the -O3 optimization flag. This research contributes to the field by automating the selection of compiler optimization sequences, which significantly reduces execution time and eliminates the need for manual tuning.
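The search step described above can be sketched as a simple random sampler over pass sequences. This is an illustrative sketch, not the paper's implementation: the pass names are a small subset of LLVM's legacy `opt` passes (the study uses 64), and `time_fn` is a hypothetical stand-in for compiling with a candidate sequence and timing the resulting binary.

```python
import random

# Illustrative subset of LLVM legacy optimization passes
# (the paper's search space covers 64 passes).
PASSES = ["-mem2reg", "-instcombine", "-gvn", "-licm",
          "-sccp", "-dce", "-loop-unroll", "-simplifycfg"]

def best_sequence(time_fn, n_candidates=2000, length=6, seed=0):
    """Draw n_candidates random pass sequences and keep the fastest.

    time_fn(seq) is a hypothetical callback that would, in a real
    pipeline, run `opt` with the sequence and measure execution time.
    """
    rng = random.Random(seed)
    best, best_time = None, float("inf")
    for _ in range(n_candidates):
        seq = [rng.choice(PASSES) for _ in range(length)]
        t = time_fn(seq)
        if t < best_time:
            best, best_time = seq, t
    return best, best_time
```

In the actual study, the timing callback would compile each benchmark with the candidate sequence and measure wall-clock execution time; here it is left abstract so the search loop itself is clear.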
It highlights the potential of machine learning in compiler optimization, offering a robust and scalable approach to improving program performance and setting the foundation for future advancements in the domain.
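The prediction step of the approach can be sketched as a nearest-neighbour lookup: given a new program's feature vector, return the best-known sequence of the most similar training program. This is a minimal illustration under assumptions; the feature vectors (e.g. counts of dynamic events collected during execution) and the `training` mapping are hypothetical placeholders, not the paper's actual feature set.

```python
import math

def nearest_sequence(features, training):
    """Return the optimization sequence of the closest training program.

    `training` maps a program name to (feature_vector, best_sequence),
    where best_sequence was found by the random search over passes.
    Uses Euclidean distance, as in a basic 1-NN classifier.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    name = min(training, key=lambda n: dist(features, training[n][0]))
    return training[name][1]
```

A production version would use k > 1 neighbours with feature normalization, but the single-neighbour form shows why KNN is attractive here: it needs no training phase beyond storing the benchmark results.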