Abstract

In addition to performance, power consumption has become a major concern in High Performance Computing (HPC) systems. Typically, the cooling system and the IT load are the major contributors to the power bill. Understanding power consumption at different levels of granularity is a first step towards quantifying the problem. With proper monitoring and effective utilization of the cooling system, the power requirements of an HPC facility can be met efficiently. In this paper we present a system in which node-level power measurement and Wireless Sensor Network (WSN) based rack-level temperature measurement are used to provide localized control of the cold air supply. A Smart Power Monitoring and Distribution Unit (PMDU) is designed and developed to replace the existing Power Distribution Unit (PDU) in the HPC system. The PMDU measures and reports power consumption and sends the measurements over Ethernet to a base station, which collects them for power profiling of the IT load of a large-scale HPC system and so gives better insight into the power utilization pattern. The WSN collects the exhaust and inlet air temperatures of the server nodes, and this information is used to direct the cold air effectively. A vent control system, which takes the node power and temperature data as its inputs, is designed and fabricated to direct the air flow intelligently to the server node inlets. It supplies or redirects more cold air towards under-cooled nodes without creating an extra load on the cooling system, thereby providing effective cooling at reduced power consumption. As an added advantage, this could help in hot-spot mitigation.
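The vent control idea can be illustrated with a minimal sketch. It is not the controller described in the paper: the `NodeReading` type, the `vent_shares` function, the 25 °C target inlet temperature, the demand formula, and the minimum-opening floor are all assumptions made for illustration. The sketch splits a fixed cold-air budget across vents, so nodes that draw more power or have warmer inlets receive a larger share while the total supply stays constant. Exhaust temperature is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class NodeReading:
    """One sample per server node (hypothetical structure)."""
    node_id: str
    power_w: float       # node power as reported by the PMDU, in watts
    inlet_temp_c: float  # inlet air temperature from the WSN, in deg C


def vent_shares(readings, target_inlet_c=25.0, min_share=0.1):
    """Return each node's fraction of a fixed cold-air budget.

    The fractions sum to 1, so redirecting air towards one node never
    increases the total demand placed on the cooling system.
    target_inlet_c and min_share are assumed tuning values.
    """
    if not readings:
        return {}

    # Assumed demand heuristic: power scaled up by how far the inlet
    # temperature exceeds the target.
    demand = {}
    for r in readings:
        excess_c = max(r.inlet_temp_c - target_inlet_c, 0.0)
        demand[r.node_id] = r.power_w * (1.0 + excess_c)

    n = len(readings)
    total = sum(demand.values())
    if total == 0.0:
        # No load anywhere: open all vents equally.
        return {node_id: 1.0 / n for node_id in demand}

    # Every vent keeps a small floor; the rest follows demand.
    floor = min_share / n
    return {
        node_id: floor + (1.0 - min_share) * d / total
        for node_id, d in demand.items()
    }


if __name__ == "__main__":
    sample = [
        NodeReading("node-1", power_w=300.0, inlet_temp_c=24.0),
        NodeReading("node-2", power_w=300.0, inlet_temp_c=30.0),
    ]
    for node_id, share in vent_shares(sample).items():
        print(f"{node_id}: {share:.2f}")
```

With two nodes drawing equal power, the one with the warmer inlet receives the larger share of the airflow, and the shares still sum to one.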
