Abstract

Data-driven methods have drawn increasing interest in HVAC fault diagnosis due to their intrinsic advantages in making real-time automated decisions. To ensure the reliability of data-driven models, it is essential to prepare sufficient labeled data for predictive modeling. In practice, determining the actual operating condition or label of each data sample (e.g., Normal or Faulty) can be very time-consuming and labor-intensive, which makes it highly challenging to develop robust data-driven solutions through conventional supervised learning. To tackle this challenge, this study proposes a data analytic framework that integrates active learning and semi-supervised learning to exploit massive unlabeled data for improved fault diagnosis performance. More specifically, five active learning methods have been tested to quantify their effectiveness in discovering valuable unlabeled data for expert labeling. Semi-supervised data-driven models have been developed to enable autonomous knowledge discovery from unlabeled building operational data through self-training protocols. Data experiments have been conducted to explore the separate and integrated value of active and semi-supervised learning. The results show that active learning can effectively identify valuable data samples for fault diagnosis, thereby reducing labeling costs by approximately 50%. Cost-effective combinatorial strategies have been derived to integrate active learning and semi-supervised learning in practical applications. The research outcomes are valuable for developing advanced data-driven solutions with substantially lower manual costs.
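
As an illustration only, the two selection steps described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the study's implementation: the abstract does not name the five active learning methods, so entropy-based uncertainty sampling is used here as one common choice, and the function names, the 0.95 confidence threshold, and the example probabilities are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_expert_labeling(probabilities, budget):
    """Active learning step (assumed: entropy-based uncertainty sampling).

    Returns the indices of the `budget` samples whose predicted class
    distribution is most uncertain; these would be sent to an expert.
    """
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: entropy(probabilities[i]),
        reverse=True,
    )
    return ranked[:budget]


def select_pseudo_labels(probabilities, threshold=0.95):
    """Self-training step (assumed: fixed confidence threshold).

    Returns (index, predicted_class) pairs for samples whose top class
    probability is at least `threshold`; these would be added to the
    training set with the model's own prediction as the label.
    """
    picked = []
    for i, probs in enumerate(probabilities):
        best = max(range(len(probs)), key=lambda k: probs[k])
        if probs[best] >= threshold:
            picked.append((i, best))
    return picked


# Hypothetical model outputs for four unlabeled samples,
# each given as [P(Normal), P(Faulty)].
probabilities = [
    [0.98, 0.02],
    [0.55, 0.45],
    [0.03, 0.97],
    [0.70, 0.30],
]

to_expert = select_for_expert_labeling(probabilities, budget=1)
pseudo = select_pseudo_labels(probabilities, threshold=0.95)
```

In this toy case the most ambiguous sample (index 1) is routed to the expert, while the two confidently classified samples (indices 0 and 2) receive pseudo-labels. In a full loop, the model would be retrained after each round and the remaining unlabeled samples rescored.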
