Abstract
Datacenters have become an important part of today’s computing infrastructure. Recent studies have shown the increasing importance of thermal considerations to achieve effective resource management. In this paper, we study thermal-aware scheduling for homogeneous high-performance computing (HPC) datacenters under a thermal model that captures both spatial and temporal correlations of the temperature evolution. We propose an online scheduling heuristic to minimize the makespan for a set of HPC applications subject to a thermal constraint. The heuristic leverages the novel notion of thermal-aware load to perform both job assignment and thermal management. To respect the temperature constraint, which is governed by a complex spatio-temporal thermal correlation, dynamic voltage and frequency scaling (DVFS) is used to regulate the job executions during runtime while dynamically balancing the loads of the servers to improve makespan. Extensive simulations are conducted based on an experimentally validated datacenter configuration and realistic parameter settings. The results show improved performance of the proposed heuristic compared to existing solutions in the literature, and demonstrate the importance of both spatial and temporal considerations. In contrast to some scheduling problems, where DVFS introduces performance–energy tradeoffs, our findings reveal the benefit of applying DVFS with both performance and energy gains in the context of spatio-temporal thermal-aware scheduling.