Abstract

Imbalanced data classification is a challenging problem in machine learning. Oversampling is a promising technique that generates synthetic minority instances to balance the dataset, but inappropriately generated minority instances may degrade the performance of the classifier. The majority of oversampling algorithms create new minority instances by choosing nearest neighbors for random interpolation. However, these methods do not provide new information to the dataset, and therefore standard classifiers do not perform well on such datasets. It is thus necessary to generate diverse minority class instances to improve the performance of the classifier. Since every feature of each minority class instance contributes valuable information, generating synthetic instances from the features of all minority instances would produce diverse minority instances, thereby improving the performance of the classifier. This paper proposes a Hierarchical Heterogeneous Ant Colony Optimization based oversampling algorithm using Feature Similarity (HHACO-FSOTe) for the generation of synthetic minority instances. Instead of choosing a few neighbors for interpolation, the proposal considers all minority instances when generating synthetic instances. HHACO-FSOTe generates new feature values by computing the minimum absolute difference between the features of a given minority instance and the corresponding features of the remaining minority instances. The features in the dataset are distributed among the ant agents, enabling parallelism and thereby reducing the time taken for oversampling. HHACO-FSOTe does not require parameter tuning or training. The proposal is evaluated on 41 low-dimensional, 11 high-dimensional and 8 noisy datasets. Experiments reveal that HHACO-FSOTe is competitive with state-of-the-art oversampling techniques. Results were validated using non-parametric statistical tests.
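
The feature-similarity step mentioned in the abstract can be illustrated with a minimal sketch. It only computes, for each minority instance and each feature, the minimum absolute difference to the same feature in the remaining minority instances. The function name `min_abs_feature_diff` and the toy data are hypothetical, and the sketch is not the authors' implementation: the abstract does not say how these differences are turned into synthetic values, and the ant colony hierarchy is not modeled here.

```python
def min_abs_feature_diff(minority):
    """For each instance i and feature j, return the minimum of
    |x[i][j] - x[k][j]| over all other minority instances k != i.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(minority)
    if n < 2:
        raise ValueError("need at least two minority instances")
    n_features = len(minority[0])
    diffs = [[0.0] * n_features for _ in range(n)]
    # Each feature column is processed independently of the others, so in
    # principle the columns could be handed to separate workers.
    for j in range(n_features):
        column = [row[j] for row in minority]
        for i in range(n):
            diffs[i][j] = min(
                abs(column[i] - column[k]) for k in range(n) if k != i
            )
    return diffs


# Toy minority class with three instances and two features (made-up values).
minority = [
    [1.0, 10.0],
    [1.5, 14.0],
    [4.0, 11.0],
]
result = min_abs_feature_diff(minority)
print(result)
```

Because every column is handled on its own, the loop over features is the natural place to split work across agents, which is consistent with the abstract's remark that features are distributed to enable parallelism.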
