In health monitoring of electromechanical transmission systems, the collected state data typically contain only a small fraction of labeled samples; the vast majority are unlabeled. Deep learning-based diagnostic models must therefore cope with scarce labeled data and abundant unlabeled data. Traditional semi-supervised deep learning methods based on pseudo-label self-training alleviate the scarcity of labeled data to some extent, but they neglect the reliability of pseudo-label information, the accuracy of feature extraction from unlabeled data, and the imbalance in sample selection. To address these issues, this paper proposes a novel semi-supervised fault diagnosis method based on information screening of class-imbalanced unlabeled samples. First, an active learning-based information screening mechanism for unlabeled data is established. The mechanism uses the variability of the intrinsic feature information of fault samples to identify unlabeled samples that lie near decision boundaries and are difficult to separate clearly. Labels are then assigned to the screened samples by combining their maximum membership degree in the classification space of the supervised model with interaction with the active learning expert system. Second, a cost-sensitive function driven by data imbalance is constructed to address the class imbalance problem in unlabeled sample screening; it adaptively adjusts the weights of samples from different classes during training to guide the supervised model. Finally, through dynamic optimization of the supervised model and of its feature extraction from unlabeled samples, the diagnostic model's ability to recognize unlabeled samples is significantly enhanced. 
Validation on two datasets, covering a total of 12 experimental scenarios, shows that when only a small amount of labeled data is available, the proposed method improves diagnostic accuracy by more than 10% over existing typical methods, demonstrating its effectiveness and its advantage over these methods in practical applications.
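
The screening and cost-sensitive steps described above can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration: the top-two probability margin stands in for the screening criterion, the arg-max class stands in for the maximum membership degree, inverse class frequency stands in for the cost-sensitive weighting, and the function names and the `margin_threshold` parameter are hypothetical.

```python
import numpy as np


def screen_unlabeled(probs, margin_threshold=0.2):
    """Split unlabeled samples into boundary (ambiguous) and confident sets.

    probs: (n_samples, n_classes) class-membership scores from the
    supervised model. The margin criterion is an assumed stand-in for
    the paper's screening rule, which the abstract does not specify.
    """
    probs = np.asarray(probs, dtype=float)
    sorted_probs = np.sort(probs, axis=1)
    # Gap between the two largest membership scores: a small gap means
    # the sample lies close to a decision boundary.
    margin = sorted_probs[:, -1] - sorted_probs[:, -2]
    boundary_idx = np.flatnonzero(margin < margin_threshold)
    confident_idx = np.flatnonzero(margin >= margin_threshold)
    # Provisional label = class with the maximum membership score.
    pseudo_labels = probs.argmax(axis=1)
    return boundary_idx, confident_idx, pseudo_labels


def class_weights(labels, n_classes):
    """Inverse-frequency class weights (assumed cost-sensitive scheme)."""
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    weights = np.zeros(n_classes)
    present = counts > 0
    # Rare classes get larger weights; classes with no samples get zero.
    weights[present] = counts.sum() / (present.sum() * counts[present])
    return weights


def weighted_cross_entropy(probs, labels, weights, eps=1e-12):
    """Cross-entropy in which each sample is scaled by its class weight."""
    probs = np.asarray(probs, dtype=float)
    picked = probs[np.arange(len(labels)), labels]
    w = weights[labels]
    return float(np.sum(-w * np.log(picked + eps)) / np.sum(w))


# Toy membership scores for four unlabeled samples and three fault classes.
probs = np.array([
    [0.90, 0.05, 0.05],
    [0.45, 0.40, 0.15],
    [0.10, 0.80, 0.10],
    [0.34, 0.33, 0.33],
])
boundary, confident, pseudo = screen_unlabeled(probs, margin_threshold=0.2)
weights = class_weights(pseudo, n_classes=3)
loss = weighted_cross_entropy(probs, pseudo, weights)
```

In this toy example the second and fourth samples fall below the margin threshold, so they are the ones that would be passed to the expert for confirmation, while the provisional labels of the remaining samples determine the class weights used in the loss.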