Abstract

Classification of imbalanced datasets is one of the main problems for machine learning techniques. The support vector machine (SVM) is biased toward majority class samples, and minority class samples may be incorrectly treated as noise. As a result, SVM has poor predictive accuracy on imbalanced datasets and generates inaccurate classification models. Existing class imbalance learning (CIL) techniques can make SVM less sensitive to class imbalance, but these methods remain vulnerable to noise and outliers. Moreover, despite its solid theoretical basis and good classification performance, SVM is not appropriate for large-scale datasets because its training complexity is closely tied to the dataset size. To overcome these difficulties, we propose CIL-FART-IFTSVM, a CIL method that combines fuzzy adaptive resonance theory (Fuzzy ART) with the intuitionistic fuzzy twin SVM (IFTSVM) and addresses class imbalance in the presence of noise, outliers, and large-scale data. In this method, we first modify the distribution of the dataset using Fuzzy ART as a clustering method to overcome the imbalance problem. Then, after data reduction, IFTSVM is used to find suitable non-parallel hyperplanes for the generated data points. Finally, a coordinate descent scheme with active-set shrinking is applied to reduce the computational complexity. Forty-five imbalanced datasets are used to validate the performance of the proposed CIL-FART-IFTSVM method. The Friedman test and the bootstrap technique with 95% confidence intervals are applied to assess the results statistically. The experimental results indicate that the proposed method performs better than the other methods compared, and that its training time on large-scale datasets is significantly shorter than that of the other classifiers.
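The clustering step can be illustrated with a minimal sketch of standard Fuzzy ART (complement coding, choice function, vigilance test, and weight update). It is not the authors' implementation: the parameter values (`rho`, `alpha`, `beta`), the function names, and in particular the `cluster_centroids` helper, which replaces each cluster by its mean as one possible form of data reduction, are assumptions made for illustration. The abstract does not specify how the clusters are converted into the reduced dataset.

```python
from typing import List, Sequence, Tuple


def complement_code(x: Sequence[float]) -> List[float]:
    """Complement coding [x, 1 - x]; features are assumed scaled to [0, 1]."""
    return list(x) + [1.0 - v for v in x]


def fuzzy_and(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Component-wise minimum (the fuzzy AND operator)."""
    return [min(p, q) for p, q in zip(a, b)]


def fuzzy_art(samples: Sequence[Sequence[float]],
              rho: float = 0.75,     # vigilance (assumed value)
              alpha: float = 0.001,  # choice parameter (assumed value)
              beta: float = 1.0      # learning rate, 1.0 = fast learning
              ) -> Tuple[List[List[float]], List[int]]:
    """Single-pass Fuzzy ART; returns category weights and a label per sample."""
    weights: List[List[float]] = []
    labels: List[int] = []
    for x in samples:
        i = complement_code(x)
        norm_i = sum(i)
        # Rank existing categories by the choice function T_j.
        order = sorted(
            range(len(weights)),
            key=lambda j: sum(fuzzy_and(i, weights[j])) / (alpha + sum(weights[j])),
            reverse=True,
        )
        for j in order:
            overlap = fuzzy_and(i, weights[j])
            if sum(overlap) / norm_i >= rho:  # vigilance test passed
                weights[j] = [beta * o + (1.0 - beta) * w
                              for o, w in zip(overlap, weights[j])]
                labels.append(j)
                break
        else:
            # No category resonates: commit a new one initialised to the input.
            weights.append(i)
            labels.append(len(weights) - 1)
    return weights, labels


def cluster_centroids(samples: Sequence[Sequence[float]],
                      labels: Sequence[int]) -> List[List[float]]:
    """Hypothetical reduction step: replace each cluster by its mean point."""
    groups: dict = {}
    for x, lab in zip(samples, labels):
        groups.setdefault(lab, []).append(x)
    return [[sum(col) / len(pts) for col in zip(*pts)]
            for _, pts in sorted(groups.items())]


# Toy "majority class" with two obvious groups.
majority = [[0.10, 0.10], [0.12, 0.11], [0.90, 0.90], [0.88, 0.92]]
weights, labels = fuzzy_art(majority)
reduced = cluster_centroids(majority, labels)
```

A higher vigilance `rho` yields more, tighter clusters and therefore a milder reduction; a lower value merges more samples into each prototype. Under the assumption above, the reduced majority set would then be combined with the minority samples before training the classifier.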
