Abstract

In practical applications, imbalanced data poses a major challenge for classification. In this paper, we propose two new methods: (1) an oversampling method, named Tomek-CASUWO, to address class imbalance; (2) a classifier, named ILS-RBFNN, to increase the accuracy of predicting customer consumption behavior. Because the existing ASUWO algorithm is easily affected by noise samples and fuzzy class boundaries, Tomek-CASUWO addresses these problems in three steps: (i) the Tomek links algorithm is used to filter noise samples; (ii) CASUWO is used to avoid overlapping class boundaries; (iii) Tomek-CASUWO is used to synthesize new samples. ILS-RBFNN, a classifier based on RBFNN, improves prediction accuracy as follows: (i) a hybrid kernel is developed by combining Gaussian and polynomial kernels; (ii) an Immune Algorithm (IA) is used to optimize the centers of the RBFNN; (iii) Least-Squares (LS) is used to calculate the biases and weights. Wine-consumer behavior data is used to compare Tomek-CASUWO with other oversampling methods. We also compare ILS-RBFNN against several well-known kernel functions and parameter update methods. The experimental results show that Tomek-CASUWO can significantly improve the prediction accuracy of a classifier, and that ILS-RBFNN outperforms other classification methods. We also conduct experiments on an extended real-world dataset. Finally, the robustness and applicability of ILS-RBFNN are verified on eleven UCI datasets. All results show that the two proposed methods outperform existing models and demonstrate the effectiveness and practicability of ILS-RBFNN for predicting customer behavior.
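
To make steps (i) and (iii) of ILS-RBFNN concrete, the following is a minimal sketch, not the paper's implementation. It assumes the hybrid kernel is a convex combination of a Gaussian and a polynomial kernel (the abstract does not give the exact form), and it takes the centers as given rather than optimizing them with an Immune Algorithm. The names `hybrid_kernel`, `fit_output_layer`, `predict`, and the parameters `sigma`, `degree`, and `alpha` are hypothetical.

```python
import numpy as np


def hybrid_kernel(X, centers, sigma=1.0, degree=2, alpha=0.5):
    """Hidden-layer activations for every (sample, center) pair.

    Assumption: the hybrid kernel is alpha * Gaussian + (1 - alpha) * polynomial.
    The abstract only says the two kernels are combined.
    """
    # Squared Euclidean distance between each sample and each center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    gauss = np.exp(-d2 / (2.0 * sigma ** 2))
    # Inhomogeneous polynomial kernel (x . c + 1)^degree.
    poly = (X @ centers.T + 1.0) ** degree
    return alpha * gauss + (1.0 - alpha) * poly


def _design_matrix(X, centers, **kernel_args):
    # Append a column of ones so the bias is solved together with the weights.
    H = hybrid_kernel(X, centers, **kernel_args)
    return np.hstack([H, np.ones((H.shape[0], 1))])


def fit_output_layer(X, y, centers, **kernel_args):
    """Least-squares solution for the output weights and bias."""
    H = _design_matrix(X, centers, **kernel_args)
    w, *_ = np.linalg.lstsq(H, y, rcond=None)
    return w  # last entry is the bias


def predict(X, centers, w, **kernel_args):
    """Real-valued network output; threshold it to obtain class labels."""
    return _design_matrix(X, centers, **kernel_args) @ w


if __name__ == "__main__":
    # Toy XOR data; the centers are simply the training points here
    # (a stand-in for centers chosen by the Immune Algorithm).
    X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    y = np.array([0.0, 1.0, 1.0, 0.0])
    w = fit_output_layer(X, y, centers=X)
    print(np.round(predict(X, X, w), 3))
```

Solving for the weights and bias in a single least-squares step keeps the output layer a linear problem once the centers are fixed, which is why only the centers need a search procedure such as IA.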
