Abstract
In many real-world applications, multiclass classification models must be learned from data with imbalanced class distributions, and multiclass imbalanced learning is receiving increasing attention from researchers. Compared with traditional imbalanced learning on binary datasets, the multiclass setting is more challenging because the class distributions can vary in many more ways and because of the inadequate performance of multiclass classification algorithms. In this paper, we propose a novel data preprocessing-based method to address this problem. The proposed method combines a one-versus-one (OVO) decomposition of class pairs with a spectral clustering technique. It first decomposes a multiclass dataset into several binary-class datasets. It then uses spectral clustering to partition the minority class of each binary-class subset into subspaces and oversamples these subspaces according to the characteristics of the data. Because sampling based on spectral clustering takes the distribution of the data into account, it effectively avoids oversampling outliers. Once the class distributions are approximately balanced, multiclass classifiers can be trained on the rebalanced data. We compared the proposed method with five state-of-the-art multiclass imbalanced learning methods on seven multiclass datasets, using the multiclass area under the ROC curve (MAUC), the precision of the minority classes (Pmin), and the average precision of all classes (Pavg) as performance metrics. The experimental results show that the proposed method achieves the best overall performance among the compared methods.
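The pipeline described above (OVO decomposition, spectral partitioning of the minority class, then oversampling within each partition) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the affinity function, the number of clusters, or the sampling rule, so everything below is an assumption made for illustration. In particular, the function names (`ovo_pairs`, `spectral_bisect`, `oversample_minority`, `rebalance_ovo`) are hypothetical, an RBF affinity with a two-way cut on the sign of the Fiedler vector stands in for the spectral clustering step, synthetic points are drawn by SMOTE-style linear interpolation between two points of the same cluster, and singleton clusters are skipped as likely outliers.

```python
from itertools import combinations

import numpy as np


def ovo_pairs(y):
    """List every unordered pair of class labels (one-versus-one decomposition)."""
    return list(combinations(sorted(set(y.tolist())), 2))


def spectral_bisect(X, gamma=1.0):
    """Split the rows of X into two groups by the sign of the Fiedler vector.

    Assumption: an RBF affinity and a two-way cut stand in for the paper's
    spectral clustering step, whose details the abstract does not give.
    """
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-gamma * sq_dists)
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    return (vecs[:, 1] > 0).astype(int)


def oversample_minority(X_min, n_new, rng, gamma=1.0):
    """Create n_new synthetic points by interpolating within spectral clusters."""
    if len(X_min) < 3:
        labels = np.zeros(len(X_min), dtype=int)
    else:
        labels = spectral_bisect(X_min, gamma)
    clusters = [np.flatnonzero(labels == k) for k in (0, 1)]
    # Hypothetical rule: skip singleton clusters as likely outliers.
    clusters = [c for c in clusters if len(c) >= 2] or [np.arange(len(X_min))]
    sizes = np.array([len(c) for c in clusters], dtype=float)
    synthetic = []
    for _ in range(n_new):
        # Pick a cluster with probability proportional to its size (assumption).
        c = clusters[rng.choice(len(clusters), p=sizes / sizes.sum())]
        a, b = rng.choice(c, size=2)  # two points from the same cluster
        t = rng.random()
        synthetic.append(X_min[a] + t * (X_min[b] - X_min[a]))
    return np.array(synthetic)


def rebalance_ovo(X, y, seed=0, gamma=1.0):
    """Return one rebalanced binary dataset per class pair."""
    rng = np.random.default_rng(seed)
    out = {}
    for a, b in ovo_pairs(y):
        mask = (y == a) | (y == b)
        X_ab, y_ab = X[mask], y[mask]
        n_a, n_b = int((y_ab == a).sum()), int((y_ab == b).sum())
        minority = a if n_a < n_b else b
        n_new = abs(n_a - n_b)
        if n_new > 0:
            X_new = oversample_minority(X_ab[y_ab == minority], n_new, rng, gamma)
            X_ab = np.vstack([X_ab, X_new])
            y_ab = np.concatenate([y_ab, np.full(n_new, minority)])
        out[(a, b)] = (X_ab, y_ab)
    return out
```

Calling `rebalance_ovo(X, y)` on a feature matrix `X` and an integer label vector `y` returns a dictionary keyed by class pair, each entry holding a binary dataset in which both classes have the same number of samples. The sketch stops there: the abstract does not say how the rebalanced binary subsets are recombined before the multiclass classifier is trained, so that step is left out. Because every synthetic point is a convex combination of two points from the same cluster, new samples stay inside the region that cluster already occupies, which is one simple way to avoid generating points around isolated outliers.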