To achieve strong classification performance, most deep-learning-based intelligent fault diagnosis methods require large numbers of labeled training samples. In practice, however, collecting enough labeled fault samples is time-consuming and laborious, so the available dataset is usually unbalanced: normal data form the vast majority, while fault samples are scarce. To address this problem, a modified active learning intelligent fault diagnosis method is proposed for rolling bearings with unbalanced samples. The method uses a limited number of labeled samples to label the unlabeled samples automatically, improving classification performance while reducing the number of labeled samples required for training. First, time-domain and time–frequency features are extracted from the vibration signals to obtain their distribution in the feature space. Second, to address class imbalance, a Gaussian mixture model is constructed to represent the sample distribution, and random undersampling is applied within the Gaussian sub-models to draw a subset of samples from the majority classes. Because the extracted samples follow a distribution similar to that of the original sample set, they can represent the original dataset and are used to build a balanced labeled sample set. Third, an initial active learning classifier based on density peak clustering is established, which uses the representative samples to label the unlabeled samples. To make fuller use of the unlabeled samples, a batch-processing scheme is adopted to update the initial classifier. The effectiveness of the proposed method is verified on two rolling bearing fault simulation experiments. The results show that the method effectively improves fault diagnosis accuracy with unbalanced samples, and that the updated classifier needs less training data to achieve comparable diagnostic performance.
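The undersampling step in the second stage can be illustrated with a minimal sketch. It assumes that each majority-class sample has already been assigned to a Gaussian sub-model by a previously fitted mixture model, and it draws a random subset from each sub-model in proportion to that sub-model's share of the data, so that the reduced set roughly preserves the original distribution. The function name `undersample_by_component`, its parameters, and the proportional quota rule are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import defaultdict


def undersample_by_component(components, n_keep, seed=0):
    """Return sorted indices of roughly n_keep samples, drawn per sub-model.

    components: one mixture-component id per majority-class sample
                (assumed to come from an already fitted Gaussian mixture model).
    n_keep:     target number of majority-class samples to retain.
    seed:       seed for the random generator, for reproducibility.
    """
    rng = random.Random(seed)

    # Group sample indices by the sub-model they were assigned to.
    groups = defaultdict(list)
    for idx, comp in enumerate(components):
        groups[comp].append(idx)

    total = len(components)
    kept = []
    for members in groups.values():
        # Quota proportional to the sub-model's share of the data,
        # with at least one sample and at most all of its members.
        quota = max(1, round(n_keep * len(members) / total))
        quota = min(quota, len(members))
        # Random undersampling without replacement inside the sub-model.
        kept.extend(rng.sample(members, quota))
    return sorted(kept)


if __name__ == "__main__":
    # Hypothetical assignment: 100 normal samples spread over 3 sub-models.
    assignment = [0] * 60 + [1] * 30 + [2] * 10
    subset = undersample_by_component(assignment, n_keep=20)
    print(len(subset), "of", len(assignment), "majority samples retained")
```

Because rounding is applied per sub-model, the number of retained samples can differ slightly from `n_keep` when the proportions are not whole numbers; a production version would need an explicit rule for distributing the remainder.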