Abstract

For an imbalanced dataset, traditional machine learning methods often misclassify minority samples because the indicators used to evaluate classification accuracy are biased toward the majority class. To address this issue, manifold cluster-based evolutionary ensemble imbalance learning, denoted MECS-Ensemble, is proposed, with the aim of providing a more effective framework for building an optimal imbalance classifier. After the original data are mapped to a manifold space, majority samples are removed from each sub-cluster according to their distribution characteristics. Following that, new minority samples are generated within each minority sub-cluster by over-sampling, so as to avoid producing new minority samples that would be misclassified because they arise from small disjuncts. In these manifold clustering-based resampling techniques, the optional operations and key parameters for normalization, manifold learning, clustering, under-sampling and over-sampling form various combinations. Thus, an evolutionary algorithm is introduced to seek the optimal structure for MECS-Ensemble. Each individual is encoded by five integers and six real numbers, and a fitness function is designed to evaluate its classification accuracy and the diversity of majority samples. The statistical experimental results on 39 imbalanced datasets show that the proposed MECS-Ensemble is superior to the other imbalance learning methods compared; in particular, the manifold clustering-based resampling technique contributes significant performance improvements.
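
The encoding described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract states only that an individual holds five integers and six real numbers; the sketch assumes that each integer selects one option for one of the five named stages and that the six real genes are parameters scaled to [0, 1]. The option lists in `STAGE_OPTIONS`, the names `Individual`, `random_individual`, `decode` and `mutate`, and the mutation scheme are all hypothetical. The fitness function is omitted because the abstract does not give its form.

```python
import random
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical candidates per stage: the abstract names the five stages,
# but not the options available in each.
STAGE_OPTIONS: Dict[str, List[str]] = {
    "normalization": ["none", "min-max", "z-score"],
    "manifold_learning": ["none", "isomap", "lle"],
    "clustering": ["k-means", "agglomerative"],
    "under_sampling": ["random", "centroid-nearest"],
    "over_sampling": ["random-duplicate", "interpolation"],
}
N_REAL = 6  # six real-valued genes; the [0, 1] range is an assumption


@dataclass
class Individual:
    choices: List[int]   # five integer genes, one per stage (assumed meaning)
    params: List[float]  # six real genes (assumed to be stage parameters)


def random_individual(rng: random.Random) -> Individual:
    """Draw each integer gene from its stage's options and each real gene from [0, 1)."""
    choices = [rng.randrange(len(opts)) for opts in STAGE_OPTIONS.values()]
    params = [rng.random() for _ in range(N_REAL)]
    return Individual(choices, params)


def decode(ind: Individual) -> Dict[str, str]:
    """Map the integer genes to the named operation chosen for each stage."""
    return {
        stage: opts[i]
        for (stage, opts), i in zip(STAGE_OPTIONS.items(), ind.choices)
    }


def mutate(ind: Individual, rng: random.Random,
           rate: float = 0.2, sigma: float = 0.1) -> Individual:
    """Resample integer genes and perturb real genes, each with probability `rate`."""
    choices = [
        rng.randrange(len(opts)) if rng.random() < rate else c
        for c, opts in zip(ind.choices, STAGE_OPTIONS.values())
    ]
    params = [
        # Gaussian perturbation, clipped back into [0, 1].
        min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) if rng.random() < rate else p
        for p in ind.params
    ]
    return Individual(choices, params)


if __name__ == "__main__":
    rng = random.Random(0)
    parent = random_individual(rng)
    print(decode(parent), parent.params)
    print(decode(mutate(parent, rng)))
```

Keeping the categorical choices and the continuous parameters in separate gene lists lets each kind be varied with an operator suited to it, which is one common way to handle mixed integer and real encodings in an evolutionary search.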
