Abstract
With the exponential increase in the cooling costs of large-scale data centers, thermal management must be adequately addressed. Recent studies have identified heat recirculation within the data center as one of the critical causes of temperature rise. In this study, we propose a new resource- and thermal-aware scheduler for Hadoop clusters; the scheduler aims to minimize the peak inlet temperature across all nodes in order to reduce power consumption and cooling cost in data centers. The proposed dynamic scheduler makes job scheduling decisions based on current CPU and disk utilization, the number of running tasks, and the feedback reported by all slave nodes at run time. We deploy a thermal model to project the temperature of each slave node, taking into account the heat contributed by neighboring nodes. The thermal-aware scheduler is integrated with Hadoop's scheduling mechanism. We test our scheduler by running a set of Hadoop benchmarks (e.g., WordCount, DistributedGrep, PI, and TeraSort) under various temperature conditions, utilization thresholds, and cluster sizes. The experimental results show that our scheduler reduces the average inlet temperature by 2.5 °C compared with the default FIFO scheduler; our scheduling solution saves approximately 15% of cooling cost with marginal performance degradation.
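The scheduling decision summarized above can be illustrated with a minimal sketch. It is not the implementation evaluated in this study: the `Node` fields, the cross-interference matrix `cross`, the per-task heat `task_heat`, and the thresholds are hypothetical names and values, and the linear temperature projection is an assumed simplification of a thermal model with heat recirculation.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_util: float     # fraction in [0, 1]
    disk_util: float    # fraction in [0, 1]
    running_tasks: int
    inlet_temp: float   # current inlet temperature in degrees Celsius


def projected_inlet_temps(nodes, candidate, task_heat, cross):
    """Project every node's inlet temperature if one task runs on `candidate`.

    cross[i][j] is an assumed coefficient: the inlet temperature rise at
    node j per unit of heat added at node i (recirculation from neighbors).
    """
    return [n.inlet_temp + cross[candidate][j] * task_heat
            for j, n in enumerate(nodes)]


def pick_node(nodes, cross, task_heat=1.0, util_threshold=0.8, max_tasks=4):
    """Return the index of the eligible node that minimizes the projected
    peak inlet temperature, or None if every node is over a threshold."""
    best, best_peak = None, float("inf")
    for i, n in enumerate(nodes):
        # Resource-aware filter: skip nodes that are already heavily loaded.
        if n.cpu_util > util_threshold or n.disk_util > util_threshold:
            continue
        if n.running_tasks >= max_tasks:
            continue
        # Thermal-aware choice: compare the peak temperature across all nodes.
        peak = max(projected_inlet_temps(nodes, i, task_heat, cross))
        if peak < best_peak:
            best, best_peak = i, peak
    return best


# Illustrative values only; they are not measurements from the experiments.
nodes = [
    Node("slave-a", cpu_util=0.9, disk_util=0.2, running_tasks=1, inlet_temp=24.0),
    Node("slave-b", cpu_util=0.3, disk_util=0.3, running_tasks=1, inlet_temp=26.0),
    Node("slave-c", cpu_util=0.4, disk_util=0.2, running_tasks=2, inlet_temp=25.0),
]
cross = [
    [1.0, 0.5, 0.1],
    [0.5, 2.0, 0.5],
    [0.1, 0.3, 0.8],
]
chosen = pick_node(nodes, cross)
```

In this example the first node is excluded by the CPU threshold, and of the remaining two the sketch prefers the node whose added heat raises the cluster-wide peak inlet temperature the least, even though it is not the node with the fewest running tasks.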