The growing demand for cloud computing and storage drives the expansion of data centers, increasing their energy consumption and environmental footprint. Data-driven dynamic thermal control is a promising way to mitigate this issue, as it can improve thermal control strategies and handle the intricate, stochastic nature of the thermal environment. However, existing solutions typically fall short in reliability and practicality. To this end, this paper proposes a feasible uncertainty-aware online learning method with reliable dynamics models and controllers. Specifically, deep learning models capable of uncertainty quantification are employed to account for the imperfections of pre-trained models, thereby providing reliable predictions. The Cross-Entropy Method and Monte Carlo trajectory sampling are used to solve the optimization problems while accounting for these uncertainties. A mathematical analysis shows that the uncertainty quantified by the system dynamics model can be captured with sufficiently many trajectory samples, yielding a reliable controller. To mitigate the limited capacity of deep learning models to quantify uncertainties, a practical framework is proposed and tested on a widely used data center model. The results demonstrate superior performance during the initial training stage, with a 66% reduction in violations and 3.76% power savings compared to the default controller. Continuous active online learning enhances adaptability, resulting in an 80% reduction in violations and 6% energy savings. The proposed solutions and insights thus improve thermal control strategies.
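The interplay between the Cross-Entropy Method and Monte Carlo trajectory sampling can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: a small ensemble of perturbed linear thermal models stands in for the uncertainty-aware deep learning dynamics model, and the names `make_ensemble`, `expected_cost`, `cem_plan`, the cost weights, and the 27 °C limit are hypothetical.

```python
import numpy as np


def make_ensemble(rng, n_models=5):
    """Hypothetical stand-in for an uncertainty-aware dynamics model.

    Each member is a linear model T' = a*T + b*u + c with slightly
    different coefficients; the spread across members plays the role
    of the model's uncertainty.
    """
    a = 0.9 + 0.02 * rng.standard_normal(n_models)
    b = -0.5 + 0.05 * rng.standard_normal(n_models)
    c = 3.0 + 0.1 * rng.standard_normal(n_models)
    return a, b, c


def expected_cost(actions, temp0, ensemble, rng, n_particles=20, limit=27.0):
    """Monte Carlo trajectory sampling.

    Every candidate action sequence is rolled out with several particles,
    each propagated through a randomly drawn ensemble member. The cost
    (cooling effort plus a penalty for exceeding the limit) is averaged
    over the particles.
    """
    a, b, c = ensemble
    n_candidates, horizon = actions.shape
    idx = rng.integers(len(a), size=(n_candidates, n_particles))
    temp = np.full((n_candidates, n_particles), temp0, dtype=float)
    cost = np.zeros((n_candidates, n_particles))
    for t in range(horizon):
        u = actions[:, t][:, None]  # shape (n_candidates, 1)
        temp = a[idx] * temp + b[idx] * u + c[idx]
        cost += u ** 2 + 10.0 * np.maximum(temp - limit, 0.0)
    return cost.mean(axis=1)


def cem_plan(temp0, ensemble, rng, horizon=5, n_candidates=64,
             n_elite=8, n_iters=5):
    """Cross-Entropy Method over cooling actions in [0, 1]."""
    mean = np.full(horizon, 0.5)
    std = np.full(horizon, 0.3)
    for _ in range(n_iters):
        noise = rng.standard_normal((n_candidates, horizon))
        actions = np.clip(mean + std * noise, 0.0, 1.0)
        costs = expected_cost(actions, temp0, ensemble, rng)
        elite = actions[np.argsort(costs)[:n_elite]]
        # Refit the sampling distribution to the lowest-cost candidates.
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ensemble = make_ensemble(rng)
    print("plan from a hot start: ", cem_plan(30.0, ensemble, rng))
    print("plan from a cool start:", cem_plan(20.0, ensemble, rng))
```

In a receding-horizon setting, only the first action of the returned plan would typically be applied before replanning from the newly observed temperature. Increasing `n_particles` makes the averaged cost a better estimate of the expectation under the ensemble, at a proportional computational cost.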