Developing an overall building performance simulation model requires a multitude of input parameters, and specifying them can be a challenging and resource-heavy task for building modellers. Furthermore, some parameters have little impact on a building’s overall performance and contribute little to model prediction accuracy. Feature selection has been employed to identify the most influential input parameters and thereby reduce model complexity and computational time. However, previous studies focused mainly on identifying parameters that affect energy consumption in residential buildings, neglecting the important relationship between energy consumption and indoor environmental quality (IEQ). Therefore, this study proposes a novel simulation framework that integrates occupancy-based building archetypes, parametric simulation, and machine learning techniques to develop an overall building performance prediction model. Using this framework, the study generates a synthetic dataset of 40,000 simulations and performs embedded feature selection using two machine learning algorithms, Random Forest (RF) and Gradient Boosting Technique (GBT), to identify parameters that simultaneously affect heating energy consumption, thermal discomfort hours, and CO2 concentration. The results demonstrate that both the importance ranking and the number of required parameters vary depending on the target variable. In addition, the set of parameters selected in the combined analysis differs from the sets selected when each target variable is analysed individually. The GBT algorithm with embedded feature selection provides the most accurate predictions, with lower root mean square error (RMSE) and absolute error (AE) than RF in both the individual and combined analyses. This study provides valuable insights for accurate parameter selection and analysis of overall building performance.