Abstract

The energy efficiency of Data Center (DC) operations relies heavily on the DC's ambient temperature as well as on the performance of its IT and cooling systems. A reliable and efficient cooling system is necessary to produce a persistent flow of cold air to cool servers that are subjected to a constantly increasing computational load due to the advent of smart cloud-based applications. Consequently, the increased demand for computing power inevitably increases the waste heat that servers generate in data centers. To improve a DC's thermal profile, which can influence the energy efficiency and reliability of IT equipment, it is imperative to analyze the thermal characteristics of the IT room. This work employs an unsupervised machine learning technique to uncover weaknesses of a DC cooling system based on real DC thermal monitoring data. The findings of the analysis identify areas for thermal management and cooling improvement, which in turn feed into recommendations for the DC. With the aim of identifying overheated zones in a DC IT room and the corresponding servers, we analyzed the thermal characteristics of the IT room. The experimental dataset includes measurements of ambient air temperature in the hot aisle of the IT room at the ENEA Portici research center, which hosts the CRESCO6 computing cluster. We use machine learning clustering techniques to identify overheated locations and categorize computing nodes based on surrounding air temperature ranges abstracted from the data. This work employs principles and approaches that can be replicated for the analysis of the thermal characteristics of any DC, thereby fostering transferability. This paper demonstrates how best practices and guidelines can be applied to the thermal analysis and profiling of a commercial DC based on real thermal monitoring data.
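As a rough illustration of the clustering idea described above, the following minimal sketch groups computing nodes into temperature ranges with a plain one-dimensional k-means. It is not the procedure used in the paper: the node names, the temperature values, the choice of three clusters, and the labels "cold", "medium", and "hot" are all assumptions made for the example.

```python
from statistics import mean

def kmeans_1d(values, k=3, iters=50):
    """Plain 1-D k-means; returns (centroids, cluster index per value)."""
    pts = sorted(values)
    # Deterministic start: pick k initial centroids spread across the sorted data.
    centroids = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid becomes the mean of its members.
        new = []
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            new.append(mean(members) if members else centroids[c])
        if new == centroids:  # converged
            break
        centroids = new
    return centroids, labels

# Hypothetical mean hot-aisle air temperatures (deg C) per node; illustrative only.
readings = {
    "node01": 24.1, "node02": 24.8, "node03": 25.3,
    "node04": 29.7, "node05": 30.2, "node06": 30.9,
    "node07": 35.4, "node08": 36.1, "node09": 36.8,
}

nodes = list(readings)
centroids, labels = kmeans_1d([readings[n] for n in nodes], k=3)

# Rank clusters by centroid temperature so the names follow the temperature order.
order = sorted(range(len(centroids)), key=lambda c: centroids[c])
names = dict(zip(order, ["cold", "medium", "hot"]))  # assumed range names

groups = {"cold": [], "medium": [], "hot": []}
for node, label in zip(nodes, labels):
    groups[names[label]].append(node)

print(groups["hot"])  # nodes surrounded by the warmest air in this toy dataset
```

In practice the number of clusters and the input features (for example, several sensors per node or time-aggregated statistics) would be chosen from the monitoring data rather than fixed in advance as they are here.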

Highlights

  • Data Centers have made considerable efforts over the past decade to improve their energy efficiency, reliability, and sustainable operation

  • Existing Data Center (DC)-related thermal management research highlights the primary challenges of cooling systems in high power density DCs [14]; recommends a list of thermal management strategies based on energy consumption awareness [2,15]; explores the effect of different cooling approaches on power usage effectiveness (PUE) using direct air with a spray system that evaporates water to cool and humidify incoming air [16]; investigates the thermal performance of air-cooled data centers with raised and non-raised floor configurations [17]; studies various thermofluid mechanisms using cooling performance metrics [18]; proposes thermal models for joint cooling and workload management [19], while other strains of research explore thermal-aware job scheduling, dynamic resource provisioning, and cooling [20]

  • Analysis of information technology (IT) and cooling systems is necessary for the investigation of DC operations-related energy efficiency

Introduction

Data Centers have made considerable efforts over the past decade to improve their energy efficiency, reliability, and sustainable operation. If a data center experiences a system failure or outage, it becomes challenging to ensure a stable and continuous provision of IT services for smart businesses, social media, etc. If such a situation occurs on a large scale, it could be detrimental to the businesses and public sectors that rely on DC services, for example, health systems, manufacturing, and entertainment. A number of theoretical and practical studies have been conducted on DC thermal management to better understand ways to mitigate the inefficiencies of cooling systems. The majority of the previously listed research focuses on simulations or numerical modeling [2,16,17,18,19,20] as well as on empirical studies involving R&D or small-scale data centers [16,21].
