Abstract

HPC systems expose configuration options to assist optimization. Configurations such as parallelism, thread and data mapping, or prefetching have been explored, but with a limited optimization objective (e.g., performance) and a fixed problem size. Unfortunately, strategies that are efficient in one scenario may generalize poorly when applied in new contexts. We investigate the impact of configuration options and of different problem sizes on performance and energy. Well-adapted NUMA-related options and cache prefetchers provide significantly larger gains for energy (5.9×) than for performance (1.85×). Moreover, reusing optimization strategies from performance to energy yields only 40% of the gains found when natively optimizing for energy, while transferring strategies across problem sizes is limited to 70% of the original gains. We fill this gap with machine learning: simple decision trees predict the best configuration for a target problem size using only information collected on another size. Our models achieve 88% of the native gains when cross-predicting between performance and energy, and 85% across problem sizes.
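The cross-size prediction idea can be sketched as follows. This is an illustrative toy, not the paper's actual pipeline: the configuration features, the synthetic labels, and the use of scikit-learn's `DecisionTreeClassifier` are all assumptions made for the sake of a runnable example.

```python
# Sketch: train a decision tree on configurations measured at one problem
# size, then reuse it to predict good configurations for another size.
# All features and labels below are synthetic, invented for illustration.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical configuration features: [num_threads, numa_policy, prefetcher]
X_small = rng.integers(0, 4, size=(200, 3))
# Synthetic label: 1 if this configuration was "best" on the small size
y_small = (X_small[:, 1] + X_small[:, 2] > 3).astype(int)

# A shallow tree keeps the model simple and interpretable
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_small, y_small)

# Reuse the model on candidate configurations for a larger problem size
X_large = rng.integers(0, 4, size=(5, 3))
predictions = model.predict(X_large)
print(predictions)
```

A shallow tree (depth 3 here) mirrors the paper's emphasis on *simple* models: the learned splits over configuration options remain human-readable, which matters when explaining why a NUMA policy or prefetcher setting transfers across sizes.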
