Abstract
Oversampling is a prevalent approach to imbalanced learning. However, existing oversampling methods have limitations of their own; for example, newly synthesized samples may be uninformative or noisy. To better address imbalanced learning tasks, this paper proposes a novel oversampling method, named the Density-induced Selection Probability-based Oversampling TEchnique (DSPOTE). To increase the number of minority-class samples, DSPOTE introduces a new scheme for filtering noisy samples based on the Chebyshev distance and a new way of computing selection probabilities based on relative density. DSPOTE first filters noisy samples and then identifies borderline samples in the minority class. Next, it computes a selection probability for each borderline sample and uses these probabilities to select borderline samples. Finally, it generates synthetic minority-class samples from the selected borderline ones. Experimental results show that our method performs well in terms of recall and AUC (Area Under the Curve) when compared with eight other methods.
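The abstract outlines a five-step pipeline (noise filtering, borderline identification, density-based selection probabilities, probabilistic seed selection, synthetic generation). The sketch below is a minimal, hypothetical Python illustration of that pipeline shape only: the noise-filtering rule, the relative-density formula, and the interpolation scheme used here are placeholder assumptions, not the definitions given in the paper.

```python
# Hypothetical DSPOTE-style pipeline sketch. The concrete noise filter,
# relative-density measure, and interpolation rule below are illustrative
# assumptions; the paper defines its own versions of each step.
import numpy as np

def chebyshev(a, b):
    # Chebyshev (L-infinity) distance between two feature vectors.
    return np.max(np.abs(a - b))

def knn_under_chebyshev(x, X, y, k):
    # Labels and distances of the k nearest neighbours of x (Chebyshev metric).
    d = np.array([chebyshev(x, xi) for xi in X])
    idx = np.argsort(d)[1:k + 1]  # skip x itself (distance 0) if it is in X
    return y[idx], d[idx]

def dspote_like_oversample(X, y, minority=1, k=5, n_new=100, seed=None):
    """Generate synthetic minority samples following the steps named in the
    abstract: (1) filter noisy minority samples, (2) keep borderline ones,
    (3) assign each a density-based selection probability, (4) draw seeds by
    that probability, (5) interpolate new points around the selected seeds."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority]

    clean, borderline, density = [], [], []
    for x in X_min:
        labels, dists = knn_under_chebyshev(x, X, y, k)
        n_maj = np.sum(labels != minority)
        if n_maj == k:          # all neighbours majority: treat as noise (assumed rule)
            continue
        clean.append(x)
        if n_maj > 0:           # mixed neighbourhood: treat as borderline (assumed rule)
            borderline.append(x)
            # Placeholder "relative density": inverse of mean neighbour distance.
            density.append(1.0 / (np.mean(dists) + 1e-12))

    borderline = np.array(borderline)
    p = np.array(density)
    p = p / p.sum()             # selection probabilities over borderline seeds

    synthetic = []
    for _ in range(n_new):
        seed_pt = borderline[rng.choice(len(borderline), p=p)]
        other = clean[rng.integers(len(clean))]   # random clean minority sample
        lam = rng.random()
        synthetic.append(seed_pt + lam * (other - seed_pt))  # linear interpolation
    return np.array(synthetic)
```

The point of the sketch is the ordering of the steps: denser borderline seeds receive higher selection probability, so more synthetic samples are generated in regions that are both near the class boundary and well supported by existing minority data.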