Abstract

Building energy systems operate under a wide range of conditions. The data available for some conditions can be far scarcer than the data for others. This is the so-called data imbalance problem: the volumes of data differ across operation conditions. The problem has largely been ignored in the field of building energy load prediction. Three questions remain unclear: how to identify the various building operation conditions, how data imbalance affects prediction accuracy, and how to overcome it. To address these three questions, this study first proposes a clustering decision tree algorithm to identify the building operation conditions. Then, the effects of data imbalance are investigated by changing the proportions of model training samples drawn from the various operation conditions. Finally, a clustering decision tree-based multi-model prediction method is proposed to solve the data imbalance problem. One year of historical operational data from a public building is used to validate the multi-model method. The results show that the proposed method achieves better prediction performance than the conventional single-model method. It decreases the mean absolute errors of energy load prediction using artificial neural networks, gradient boosting trees, random forests, and support vector regression by 9.83%, 6.71%, 1.32%, and 12.22% on average, respectively. In addition, it increases the coefficients of determination of energy load prediction using the four algorithms by 8.47%, 4.59%, 0.26%, and 13.99% on average, respectively.
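
The contrast between a single model and a multi-model scheme can be illustrated with a minimal sketch. It is not the clustering decision tree algorithm of this study: the operation-condition labels are assumed to be already known, a one-feature least-squares line stands in for the four learning algorithms, and the data, the condition names, and all function names are hypothetical and made up for illustration. It only shows the general idea of training one predictor per operation condition and comparing the mean absolute error and the coefficient of determination against a single model trained on all samples.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # constant feature: fall back to predicting the mean
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def fit_multi_model(samples):
    """Fit one line per operation condition; samples are (condition, x, y)."""
    groups = defaultdict(list)
    for cond, x, y in samples:
        groups[cond].append((x, y))
    return {
        cond: fit_line([x for x, _ in pts], [y for _, y in pts])
        for cond, pts in groups.items()
    }


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    my = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up, imbalanced toy data: 10 "occupied" samples and 3 "unoccupied" ones.
# x could be read as outdoor temperature and y as energy load (assumption).
samples = [("occupied", x, 2 * x + 10) for x in range(20, 30)]
samples += [("unoccupied", x, 0.5 * x + 2) for x in (20, 25, 30)]
y_true = [y for _, _, y in samples]

# Single model: one line fitted to all samples, ignoring the condition.
a, b = fit_line([x for _, x, _ in samples], y_true)
pred_single = [a * x + b for _, x, _ in samples]

# Multi-model: each sample is predicted by the line of its own condition.
models = fit_multi_model(samples)
pred_multi = [models[c][0] * x + models[c][1] for c, x, _ in samples]

mae_single, mae_multi = mae(y_true, pred_single), mae(y_true, pred_multi)
r2_single, r2_multi = r2(y_true, pred_single), r2(y_true, pred_multi)
print("MAE single vs multi:", mae_single, mae_multi)
print("R2  single vs multi:", r2_single, r2_multi)
```

In this toy setting the minority condition follows a different load pattern, so the single model fits it poorly while the per-condition models do not. Real operational data are noisy, and the size of any gain depends on how well the conditions are identified.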
