Energy Efficient Job Co-scheduling for High-Performance Parallel Computing Clusters

David K Newsom, Olivier Serres, Abdel-Hameed A Badawy, Tarek El-Ghazawi, Sardar F Azari

doi:10.1109/smartcity.2015.127

Abstract

Cost and environmental concerns continue to drive research in high-performance computing (HPC) energy optimization. Commodity server platforms are increasingly deployed as compute clusters that have a variety of energy management control features. In this paper, we examine the energy reduction achieved by different ways of co-scheduling benchmark codes on an HPC cluster, using different combinations of job queue control dimensions, including thread core affinity interleaving, Dynamic Voltage and Frequency Scaling (DVFS), and job re-ordering. The combination space of control parameters, in conjunction with varying job queue depths, is too large to explore by direct measurement, so we developed a scheduling simulator that can quickly and efficiently examine a large permutation space of job-spans to find the energy-optimal order and control configuration. Equipped with the base time/energy profiles of the benchmark algorithms, the simulator can reliably predict the execution time and energy of all job queue permutation (ordering) choices, including the optimal control parameter combinations, within a 3% margin of error.
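The search described in the abstract can be illustrated with a minimal sketch. This is not the authors' simulator: the job names, the profile numbers, the two frequency levels, the pairwise co-scheduling rule, and the `slack_power_w` penalty are all hypothetical assumptions chosen only to show how a base time/energy profile table lets every ordering and DVFS setting be scored without running the jobs.

```python
from itertools import permutations, product

# Hypothetical base profiles (made-up numbers, not measurements from the paper):
# job name -> {frequency in GHz: (run time in s, energy in J)}
PROFILES = {
    "job_a": {1.2: (100.0, 5000.0), 2.4: (60.0, 6000.0)},
    "job_b": {1.2: (80.0, 3600.0), 2.4: (50.0, 4500.0)},
    "job_c": {1.2: (40.0, 2000.0), 2.4: (30.0, 2400.0)},
    "job_d": {1.2: (90.0, 4000.0), 2.4: (55.0, 5200.0)},
}


def simulate(order, freqs, profiles, slack_power_w=30.0):
    """Predict (total time, total energy) for one queue order and DVFS choice.

    Toy model (an assumption): the queue is consumed two jobs at a time, the
    two jobs run side by side, the span lasts as long as the slower job, and
    cores that finish early draw slack_power_w until the span ends.
    """
    total_time = 0.0
    total_energy = 0.0
    for i in range(0, len(order), 2):
        pair = [profiles[job][freqs[job]] for job in order[i:i + 2]]
        times = [t for t, _ in pair]
        span = max(times)
        total_time += span
        total_energy += sum(e for _, e in pair)
        total_energy += slack_power_w * sum(span - t for t in times)
    return total_time, total_energy


def best_schedule(profiles, slack_power_w=30.0):
    """Exhaustively score every ordering and per-job frequency combination."""
    jobs = sorted(profiles)
    best = None
    for order in permutations(jobs):
        for combo in product(*(sorted(profiles[job]) for job in jobs)):
            freqs = dict(zip(jobs, combo))
            t, e = simulate(order, freqs, profiles, slack_power_w)
            if best is None or e < best[0]:
                best = (e, t, order, freqs)
    return best


if __name__ == "__main__":
    energy, time_s, order, freqs = best_schedule(PROFILES)
    print(f"order={order} freqs={freqs} time={time_s:.1f}s energy={energy:.1f}J")
```

Even this toy version evaluates 4! x 2^4 = 384 configurations for four jobs, and the count grows factorially with queue depth, which is the reason the abstract gives for predicting from profiles instead of measuring each configuration directly.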

Full Text