Abstract

In class-imbalance learning, the Synthetic Minority Oversampling Technique (SMOTE) is widely used to tackle class imbalance at the data level. However, SMOTE blindly selects neighboring minority-class points when interpolating between them, and it inevitably introduces collinearity between the generated points and the original ones. To combat these problems, we propose an adaptive-weighting SMOTE method, termed AWSMOTE. AWSMOTE applies two types of SVM-based weights to SMOTE: one weight is used in variable space to counteract the drawbacks of collinearity, while the other is used in sample space to purposefully choose support vectors from the minority class as the neighboring points for interpolation. AWSMOTE is compared with SMOTE and its improved variants on six simulated datasets and 22 real-world datasets. The results demonstrate the effectiveness and advantages of the proposed approach.
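The collinearity the abstract mentions comes from how plain SMOTE builds new points: every synthetic sample lies on the straight segment between a minority sample and one of its minority-class nearest neighbors. A minimal numpy sketch of that interpolation step (the function name `smote_sample` and the toy data are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def smote_sample(X_min, k=2, n_new=4, rng=rng):
    """Minimal SMOTE sketch: pick a random minority sample, pick one of
    its k nearest minority neighbours, and interpolate at a random
    position on the segment joining them."""
    n = len(X_min)
    # pairwise distances among minority samples
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]      # k nearest-neighbour indices
    out = []
    for _ in range(n_new):
        i = rng.integers(n)                # random minority sample
        j = nn[i, rng.integers(k)]         # random neighbour among its k-NN
        gap = rng.random()                 # interpolation position in (0, 1)
        out.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(out)

X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_new = smote_sample(X_min)
```

Because each new point is a convex combination of exactly two existing points, synthetic samples are linearly dependent on the originals; this is the effect AWSMOTE's variable-space weight is designed to weaken.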

Highlights

  • Learning from datasets with only positive samples

  • In this paper, inspired by the Synthetic Minority Oversampling Technique (SMOTE) and SVM, a new effective oversampling method, called Adaptive-Weighting SMOTE (AWSMOTE), is proposed to deal with imbalanced learning. It divides minority samples into support vectors and nonsupport vectors according to the SVM and assigns different weights to the samples. Then, AWSMOTE makes each minority support vector generate the same number of new samples and uses the SVM to predict these samples. The additional weight of each support vector is determined by the prediction accuracy on the new samples: the more correctly the new samples are predicted, the greater the weight of the minority sample will be.

  • The majority class is constructed by grouping all the labels except the minority class; multiclass datasets are used because only a small number of binary datasets are available. The relevant information for each dataset is presented in Table 3.
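The first highlight's weighting rule can be sketched concretely: fit an SVM, keep the minority-class support vectors, let each one generate the same number of trial samples, and weight it by how many of those trials the SVM predicts as minority. A sketch under assumed toy data (the variable names, the Gaussian classes, and `n_trial` are hypothetical choices, not the paper's setup):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# toy imbalanced data: class 1 is the minority (hypothetical example)
X_maj = rng.normal(loc=0.0, scale=1.0, size=(40, 2))
X_min = rng.normal(loc=3.0, scale=1.0, size=(8, 2))
X = np.vstack([X_maj, X_min])
y = np.array([0] * 40 + [1] * 8)

clf = SVC(kernel="linear").fit(X, y)

# split the minority class into support vectors and nonsupport vectors
sv_idx = set(clf.support_)
min_idx = np.where(y == 1)[0]
min_sv = [i for i in min_idx if i in sv_idx]

# each minority support vector generates the same number of trial samples;
# its additional weight is the fraction the SVM predicts as minority
n_trial = 5
weights = {}
for i in min_sv:
    others = min_idx[min_idx != i]
    gaps = rng.random((n_trial, 1))              # interpolation positions
    nbrs = X[rng.choice(others, n_trial)]        # random minority partners
    trial = X[i] + gaps * (nbrs - X[i])          # SMOTE-style interpolation
    weights[i] = np.mean(clf.predict(trial) == 1)
```

Support vectors whose synthetic offspring are reliably classified as minority end up with larger weights, so they are favored when the final oversampling budget is allocated.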


Summary

Introduction

Learning from datasets with only positive samples attempts to find patterns in the data and to effectively separate the positive samples from the potentially negative samples in the larger hypothesis space. In this paper, inspired by SMOTE and SVM, a new effective oversampling method, called AWSMOTE, is proposed to deal with imbalanced learning. AWSMOTE makes each minority support vector generate the same number of new samples and uses the SVM to predict these samples. Unlike general adaptive weighting methods, AWSMOTE can judge which minority samples are more suitable for generating new samples to improve classification, and it combines with SVM to highlight the role of support vectors. We introduce variable weights through the estimate vector of the SVM, which weakens the collinearity brought by SMOTE and makes the generated samples more consistent with the characteristics of the classifier. (1) AWSMOTE considers and combines sample and variable weights in the process of oversampling, which focuses on the distributional characteristics of the data and improves classification accuracy. (3) AWSMOTE adaptively generates different numbers of new samples by using the weight of each minority sample.
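One way to read the variable-weight idea above: take the magnitude of the linear SVM's coefficient (estimate) vector as a per-variable importance, and let each coordinate receive its own interpolation gap scaled by that importance, so the synthetic point leaves the straight line plain SMOTE would put it on. This is an illustrative interpretation, not the paper's exact formula; the estimator choice, `weighted_interp`, and the clipping are assumptions:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)

# toy imbalanced data in 3 variables (hypothetical example)
X_maj = rng.normal(0.0, 1.0, (40, 3))
X_min = rng.normal(2.5, 1.0, (10, 3))
X = np.vstack([X_maj, X_min])
y = np.array([0] * 40 + [1] * 10)

# variable weights from the magnitude of the linear-SVM coefficient vector
svm = LinearSVC(dual=False).fit(X, y)
w_var = np.abs(svm.coef_.ravel())
w_var /= w_var.sum()

def weighted_interp(x, nbr, w_var, rng=rng):
    """Per-variable interpolation: each coordinate gets its own gap scaled
    by that variable's weight, so the new point generally leaves the
    straight segment between x and nbr (weakening SMOTE's collinearity)."""
    gaps = rng.random(len(x)) * w_var * len(x)   # variable-wise gaps
    gaps = np.clip(gaps, 0.0, 1.0)               # stay between x and nbr
    return x + gaps * (nbr - x)

x, nbr = X_min[0], X_min[1]
x_new = weighted_interp(x, nbr, w_var)
```

With a single scalar gap the new point is collinear with its two parents; with coordinate-wise gaps it stays inside the axis-aligned box spanned by the parents but off the connecting line, which better reflects the classifier's view of which variables matter.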
