Abstract

Machine learning is a promising approach for developing building energy-benchmarking models. However, the high dimensionality of building energy datasets can reduce model accuracy and generalization while increasing computational cost. Meanwhile, the poor interpretability of machine learning models makes their results difficult to understand and, in turn, hinders policymaking. The first objective of this study was therefore to investigate the benefits of feature selection for the performance of machine learning-based energy usage models. Three typical feature selection methods (filter, wrapper, and embedded) were selected, and the effect of each was evaluated using three tree-ensemble learning algorithms. The second objective was to analyze the interpretability of the machine learning model using the Shapley additive explanation method. The results were obtained using a city-scale energy consumption dataset of 478 healthcare buildings in Chongqing, China. The wrapper method generally improved the accuracy of the machine learning models more than the other two methods did. In addition, the model developed using extreme gradient boosting combined with the wrapper method achieved the best accuracy. Moreover, the model interpretability analysis identified the important features and revealed how these features influence energy use in individual buildings.
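
To make the idea behind Shapley additive explanations concrete, the sketch below computes exact Shapley values for a single building by brute-force enumeration of feature coalitions. It is an illustration only and not the procedure used in the study: the function names, the toy energy model, its coefficients, and the feature values are all hypothetical, and features absent from a coalition are simply set to the values of a reference (baseline) building.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for sample x relative to a baseline sample.

    Features outside a coalition take their baseline values (an assumption
    of this sketch; other ways of handling absent features exist).
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_energy_model(features):
    """Hypothetical annual energy use (kWh); all coefficients are invented."""
    floor_area, beds, operating_hours = features
    return (50.0 * floor_area + 2000.0 * beds + 10.0 * operating_hours
            + 0.5 * floor_area * beds)


# Hypothetical feature values: [floor area (m2), beds, operating hours per year]
baseline = [1000.0, 50.0, 3000.0]   # reference building
building = [1500.0, 80.0, 4000.0]   # building being explained

phi = shapley_values(toy_energy_model, building, baseline)
for name, value in zip(["floor_area", "beds", "operating_hours"], phi):
    print(f"{name}: {value:+.1f} kWh")
```

The contributions sum to the difference between the prediction for the explained building and the prediction for the baseline building, which is the additive property that lets each building's predicted energy use be decomposed feature by feature. Enumeration scales exponentially with the number of features, so it is only practical for small toy cases such as this one.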
