Abstract

For imbalanced classification, data-level methods can achieve inter-class balance, but the generated samples contain no new information and risk introducing noise. Algorithm-level methods may cause the model to overfit, and their classification performance depends heavily on the specific dataset and task, so they lack generality. In addition, two challenges remain important in imbalanced classification: how to deeply mine the distribution differences in class-overlap regions, and how to effectively mine inter-class differences when the absolute number of minority samples is small. This paper proposes an imbalanced binary classification method using multi-label confidence comparison based on contrastive learning. Unlike previous approaches that learn distribution characteristics directly from minority samples, the classification task is redefined, following the idea of contrastive learning, as a multi-label matching task that mines deep features representing the commonalities and differences between neighboring samples. For each sample, multiple differentiated contrastive sample groups are obtained by random sampling from its neighbor pool; the sample is then combined with its contrastive sample groups to form multiple sample-neighbor pairs, which serve as training samples for the multi-label matching task. In this way, the original dataset is multiplied without introducing noise, laying a foundation for effectively mining class differences when the absolute number of minority samples is small. Based on the reconstruction error produced by a Variational AutoEncoder (VAE) for each sample-neighbor pair, a multi-label matching loss between target samples and contrastive sample groups is designed that integrates the idea of contrastive learning.
A robust classifier is obtained by jointly and iteratively learning the reconstruction error and the multi-label matching loss, which better mines the distribution differences of overlapping regions. In the testing phase, multiple different contrastive sample groups and the corresponding predictions are obtained for each sample to be classified, and its category is determined by integrating the predictions of all groups through reverse reasoning. Experimental results on 38 public datasets show that the method outperforms typical imbalanced classification methods in both F1-measure and G-mean.
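The pair-construction step described above (drawing several random contrastive groups from each sample's neighbor pool) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the brute-force k-NN search, and the parameter choices (`k`, `n_groups`, `group_size`) are all assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

def build_sample_neighbor_pairs(X, k=5, n_groups=3, group_size=2):
    """For each sample, draw several contrastive groups from its k-NN pool.

    Hypothetical sketch of the sampling idea in the abstract; the paper's
    actual neighbor-pool construction and group sizes may differ.
    """
    # Pairwise Euclidean distances (brute force, for clarity only).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # a sample is not its own neighbor
    pairs = []
    for i in range(len(X)):
        pool = np.argsort(d[i])[:k]      # neighbor pool of sample i
        for _ in range(n_groups):
            group = rng.choice(pool, size=group_size, replace=False)
            pairs.append((i, group))     # one sample-neighbor training pair
    return pairs

# Toy data: the pair count is len(X) * n_groups, i.e. the training set is
# multiplied without generating any synthetic points.
X = rng.normal(size=(10, 2))
pairs = build_sample_neighbor_pairs(X)
print(len(pairs))  # 10 samples * 3 groups each = 30 pairs
```

Because every pair reuses only real samples, no synthetic noise is introduced; the enlargement comes purely from recombination, which matches the abstract's claim about multiplying the dataset.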
