Abstract

Class imbalance is prevalent in many practical classification tasks, and a high imbalance ratio significantly degrades classification performance. Existing methods for highly imbalanced data classification still face two key difficulties: (1) learning the key information of each class fairly, and (2) maintaining distribution consistency. To address these difficulties, we propose a novel majority clustering-based adaptive undersampling enhanced ensemble classification method, which integrates undersampling and ensemble techniques. In the adaptive undersampling process, we first consider the spatial distribution of the majority samples to ensure distribution consistency. We then use an adaptive sampling rate and introduce a feedback mechanism to obtain more representative majority samples from each cluster. In the classifier ensemble process, multiple ensemble iterations are introduced so that key information in the different classes receives fair attention. Finally, six types of experiments are conducted on 17 real-world highly imbalanced datasets from multiple fields. The experimental results demonstrate that the proposed method outperforms existing methods in terms of effectiveness, robustness, and adaptability.
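
The idea of undersampling the majority class cluster by cluster, so that the retained subset keeps the spatial distribution of the original, can be illustrated with a minimal sketch. This is not the proposed method: the abstract does not specify the clustering algorithm, the adaptive sampling rate, or the feedback mechanism, so the sketch assumes that cluster labels are already available and substitutes a fixed proportional allocation for the adaptive rate. The function names `proportional_quotas` and `undersample_majority` are hypothetical.

```python
import random
from collections import defaultdict


def proportional_quotas(cluster_sizes, n_target):
    """Split n_target across clusters in proportion to cluster size.

    Uses the largest-remainder rule so the quotas sum exactly to n_target.
    Assumption: a fixed proportional rate stands in for the adaptive rate
    described in the abstract, whose exact form is not given there.
    """
    total = sum(cluster_sizes.values())
    if not 0 < n_target <= total:
        raise ValueError("n_target must be in 1..number of majority samples")
    # Integer floor of each cluster's proportional share.
    quotas = {c: size * n_target // total for c, size in cluster_sizes.items()}
    leftover = n_target - sum(quotas.values())
    # Hand the remaining slots to the clusters with the largest remainders.
    by_remainder = sorted(
        cluster_sizes,
        key=lambda c: (cluster_sizes[c] * n_target) % total,
        reverse=True,
    )
    for c in by_remainder[:leftover]:
        quotas[c] += 1
    return quotas


def undersample_majority(majority, cluster_ids, n_target, seed=0):
    """Draw n_target majority samples, stratified by precomputed cluster id."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for sample, cid in zip(majority, cluster_ids):
        groups[cid].append(sample)
    quotas = proportional_quotas({c: len(g) for c, g in groups.items()}, n_target)
    subset = []
    for cid, members in groups.items():
        # Sample without replacement inside each cluster.
        subset.extend(rng.sample(members, quotas[cid]))
    return subset


# Toy majority class: 100 samples in three clusters of sizes 60, 30, and 10.
majority = list(range(100))
cluster_ids = ["a"] * 60 + ["b"] * 30 + ["c"] * 10
subset = undersample_majority(majority, cluster_ids, n_target=10)
```

In an ensemble setting, one would presumably repeat such a draw with different seeds and train one base classifier per balanced subset; how the proposed method adapts the per-cluster rate between iterations is not described in the abstract.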
