Abstract
Data-driven fault diagnosis methods often require abundant labeled examples for each fault type. In contrast, real-world data is often unlabeled and consists mostly of healthy observations, with only a few samples of faulty conditions. The lack of labels and fault samples poses a significant challenge for existing data-driven fault diagnosis methods. In this paper, we aim to overcome this limitation by integrating expert knowledge with domain adaptation in a synthetic-to-real framework for unsupervised fault diagnosis. Motivated by the fact that domain experts often have a relatively good understanding of how different fault types affect healthy signals, the first step of the proposed framework generates a synthetic fault dataset by augmenting real vibration samples of healthy bearings. This synthetic dataset integrates expert knowledge and encodes class information about the fault types. However, models trained solely on the synthetic data often do not perform well because of the distribution difference between the synthetically generated and real faults. To overcome this domain gap between the synthetic and real data, the second step of the proposed framework introduces an imbalance-robust domain adaptation (DA) approach that adapts the model from synthetic faults (source) to the unlabeled real faults (target), which suffer from severe class imbalance. The framework is evaluated on two unsupervised fault diagnosis cases for bearings: the CWRU laboratory dataset and a real-world wind-turbine dataset. Experimental results demonstrate that the generated faults are effective for encoding fault type information and that the domain adaptation is robust against different levels of class imbalance between faults.
Highlights
Data-driven fault diagnosis methods often require a large amount of labeled data to generalize well
Motivated by the fact that domain experts often have a good understanding of how different fault types affect healthy signals, we propose to integrate expert knowledge in synthetic data with imbalance-robust domain adaptation for unsupervised fault diagnosis
We propose a novel bearing fault diagnosis framework that can learn effective models from unlabeled real bearing data
Summary
Data-driven fault diagnosis methods often require a large amount of labeled data to generalize well. Not all fault types may have been captured by the different assets. These recordings are often unlabeled because precisely identifying when and which fault is emerging can be difficult even for experienced domain experts. The corresponding data-driven models can be trained based on these imitated faults. This approach is different from previous works which learn solely from the synthetic data, because the models have implicit or explicit access to information from the small real set of labeled faults. In these works, the unavoidable domain shift between the synthetic and real faults is still overlooked. Our work addresses this by considering a mix of both the healthy and synthetic data, which will result in a more realistic source domain.
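To make the idea of a mixed source domain concrete, the following is a minimal sketch of how healthy vibration segments might be augmented with imitated faults and then combined with the untouched healthy segments. It is not the generation procedure used in the paper: the impulse model (exponentially decaying resonance bursts repeated at a fault characteristic frequency), the function names `add_synthetic_fault` and `build_source_domain`, and all numeric values (sampling rate, fault frequencies, resonance, decay) are illustrative assumptions. Random noise stands in for real healthy recordings.

```python
import numpy as np


def add_synthetic_fault(healthy, fs, fault_freq,
                        resonance_freq=3000.0, decay=800.0, amplitude=1.0):
    """Superimpose decaying resonance bursts, repeating at fault_freq (Hz),
    on one healthy segment. All parameter defaults are hypothetical."""
    t = np.arange(healthy.shape[0]) / fs
    # Time elapsed since the most recent simulated impact
    tau = np.mod(t, 1.0 / fault_freq)
    bursts = amplitude * np.exp(-decay * tau) * np.sin(2 * np.pi * resonance_freq * tau)
    return healthy + bursts


def build_source_domain(healthy_segments, fs, fault_freqs):
    """Return (X, y) mixing healthy segments (label 0) with one synthetic
    fault class per entry of fault_freqs (labels 1, 2, ...)."""
    n = len(healthy_segments)
    xs = [healthy_segments]
    ys = [np.zeros(n, dtype=int)]
    for label, freq in enumerate(fault_freqs.values(), start=1):
        faulty = np.stack([add_synthetic_fault(seg, fs, freq)
                           for seg in healthy_segments])
        xs.append(faulty)
        ys.append(np.full(n, label))
    return np.concatenate(xs), np.concatenate(ys)


# Placeholder data: Gaussian noise in place of real healthy bearing signals
rng = np.random.default_rng(0)
healthy = 0.1 * rng.standard_normal((4, 2048))

# Hypothetical characteristic frequencies (Hz) for three fault types
fault_freqs = {"inner_race": 160.0, "outer_race": 105.0, "ball": 140.0}
X, y = build_source_domain(healthy, fs=12000, fault_freqs=fault_freqs)
```

The resulting `X` and `y` form a labeled source set in which the healthy class consists of real signals and the fault classes are synthetic. A classifier trained on it would still need a domain adaptation step to transfer to the unlabeled real faults, as the text describes.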