Abstract
Advances in building digitization, smart sensing, and metering technologies have allowed large amounts of timeseries data to be collected for monitoring, analyzing, and controlling building systems. However, due to sensor or communication failures, the data collected are often incomplete and of poor quality. Data imputation approaches that replace missing values, based on either statistical or predictive models, have been widely adopted for multivariate datasets in other domains. It is hence of interest to find an effective way to impute timeseries data collected from a building system. In this paper, we evaluate multiple data imputation approaches using data collected from a medium-sized building situated in Stockholm, Sweden, and a small commercial building from the ASHRAE RP-1312 research project. Sensors with widely varying characteristics were selected from the case study buildings to evaluate the imputation methods. We evaluated the imputation accuracy and the impact of each chosen imputation method on information entropy, short-term building forecasting model performance, and fault detection strategy. Results demonstrate that incorporating time-lagged cross-correlations within a k-nearest neighbor (kNN) model provides the most accurate imputations without affecting the quality of subsequent data analysis.
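To make the idea of combining time-lagged cross-correlations with a kNN model more concrete, the following is a minimal illustrative sketch, not the implementation evaluated in the paper. It assumes that the predictor sensors are complete, that a single best lag per predictor is chosen by maximising the absolute Pearson correlation with the target, and that neighbors are found by Euclidean distance on the lag-aligned predictor values. All function names (`pearson`, `shift`, `best_lag`, `knn_impute`) and parameters (`k`, `max_lag`) are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation over positions where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if len(pairs) < 3:
        return 0.0
    mx = sum(a for a, _ in pairs) / len(pairs)
    my = sum(b for _, b in pairs) / len(pairs)
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def shift(series, lag):
    """Return s with s[t] = series[t - lag]; out-of-range positions become None."""
    n = len(series)
    return [series[t - lag] if 0 <= t - lag < n else None for t in range(n)]


def best_lag(target, other, max_lag):
    """Lag in [-max_lag, max_lag] that maximises |correlation| with the target."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: abs(pearson(target, shift(other, lag))))


def knn_impute(target, others, k=3, max_lag=5):
    """Fill None entries of `target` from its k nearest neighbors in the
    lag-aligned predictor space (assumption: predictors have no gaps)."""
    aligned = [shift(o, best_lag(target, o, max_lag)) for o in others]
    features = list(zip(*aligned))  # features[t] = aligned predictor values at t
    observed = [t for t, v in enumerate(target)
                if v is not None and None not in features[t]]
    filled = list(target)
    for t, v in enumerate(target):
        if v is not None or None in features[t]:
            continue  # nothing to impute, or no usable predictors at this step
        dists = sorted((math.dist(features[t], features[s]), s) for s in observed)
        nearest = [s for _, s in dists[:k]]
        if nearest:
            filled[t] = sum(target[s] for s in nearest) / len(nearest)
    return filled
```

As a usage example, `knn_impute(supply_temp, [outdoor_temp, valve_position], k=3, max_lag=6)` would fill the gaps in a hypothetical `supply_temp` list (with `None` marking missing readings) using two other sensor series of the same length. Aligning each predictor at its best lag before the neighbor search is what lets a delayed physical response, such as a temperature that follows a valve movement, contribute to the distance computation.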