The classification of unbalanced data has received extensive attention in recent years. Unbalanced sample data lowers fault diagnosis and classification accuracy and limits the ability to classify minority-class fault samples. To address the insufficient ability of machine learning classifiers to identify minority-class samples in unbalanced data, this paper proposes an improved support vector machine (SVM) classification method based on the synthetic minority over-sampling technique (SMOTE). For the sampler, an improved SMOTE algorithm based on the characteristics of neighborhood distribution (CND-SMOTE) is used to balance the minority-class and majority-class samples. For the classifier, an SVM whose parameters are optimized by the bat algorithm (BA-SVM) is used to solve the multi-class classification problem for fault samples. Experimental results show that the CND-SMOTE+BA-SVM algorithm can synthesize high-quality minority-class fault samples, increase the classification accuracy for fault samples, and reduce classification time.
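
For readers unfamiliar with the oversampling step, the following is a minimal sketch of baseline SMOTE only. It is not the CND-SMOTE variant proposed here, whose neighborhood-distribution rules are not given in this abstract, and it does not cover the BA-SVM classifier. The function name `smote_sketch`, its parameters, and the toy data are assumptions made purely for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Baseline SMOTE (illustrative, not CND-SMOTE).

    Creates n_new synthetic points, each lying on the line segment between a
    randomly chosen minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 4 minority samples versus an assumed 10 majority samples.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
majority_count = 10
new_points = smote_sketch(minority, majority_count - len(minority))
balanced_minority = minority + new_points
```

Because every synthetic point is interpolated between two existing minority samples, it stays inside the region those samples span. Baseline SMOTE can therefore amplify noisy or borderline samples, which is the kind of weakness that neighborhood-aware variants aim to reduce.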