Abstract
Machine learning models have gained popularity for their potential to solve real-life problems when trained on pertinent data. In many cases, real-life data are class-imbalanced, and models trained on such data tend to perform poorly on metrics such as precision, recall, AUC, F1, and G-mean score. Because class imbalance poses serious challenges to the performance of trained models, a multitude of research works have addressed it. Two data-based sampling techniques are most commonly proposed: undersampling the majority class and oversampling the minority class. In this article, we focus on the former. We propose two novel algorithms that use neural network-based approaches to remove majority samples found in the vicinity of minority samples, thereby undersampling the majority class to remove (or alleviate) the imbalance. We describe the proposed algorithms and test them on several publicly available imbalanced datasets. We then compare their performance with that of other popular undersampling algorithms. We conclude that the proposed algorithms outperform most existing undersampling approaches on most performance metrics.
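To make the general idea concrete, the following is a minimal sketch of neighborhood-based undersampling together with the G-mean metric. It is not the neural network-based method proposed in the article, whose details the abstract does not give; it substitutes a plain Euclidean nearest-neighbor rule as a stand-in. The function names `undersample_near_minority` and `g_mean`, the parameter `k`, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import numpy as np

def undersample_near_minority(X, y, minority_label=1, k=1):
    """Illustrative stand-in (NOT the article's algorithm): for every
    minority sample, drop its k nearest majority samples."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    min_idx = np.flatnonzero(y == minority_label)
    maj_idx = np.flatnonzero(y != minority_label)
    # Pairwise Euclidean distances, shape (n_minority, n_majority).
    d = np.linalg.norm(X[min_idx][:, None, :] - X[maj_idx][None, :, :], axis=2)
    # Column positions of the k closest majority samples per minority sample.
    nearest = np.argsort(d, axis=1)[:, :k]
    drop = np.unique(maj_idx[nearest.ravel()])
    keep = np.setdiff1d(np.arange(len(y)), drop)
    return X[keep], y[keep]

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for a binary task."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    pos = y_true == positive
    sensitivity = np.mean(y_pred[pos] == positive)
    specificity = np.mean(y_pred[~pos] != positive)
    return float(np.sqrt(sensitivity * specificity))

# Toy data (hypothetical): two minority points near x = 0, four majority points.
X = [[0.0], [0.1], [1.0], [2.0], [3.0], [0.2]]
y = [1, 0, 0, 0, 0, 1]
X_res, y_res = undersample_near_minority(X, y, minority_label=1, k=1)
```

In this toy run, the majority point at 0.1 is the nearest majority neighbor of both minority points, so it is the only sample removed. A fixed `k` is the simplest possible notion of "vicinity"; a learned, model-based criterion such as the one the article proposes would replace that rule.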
More From: IEEE Transactions on Systems, Man, and Cybernetics: Systems