Abstract

Learning from non-stationary, imbalanced data streams is a serious challenge for the machine learning community. Many works address the classification of non-stationary data streams, but most of them do not take into account that real-life data streams may exhibit a high and changing class imbalance ratio, which may complicate the classification task. This work attempts to connect two important, yet rarely combined, research trends in data analysis, i.e., non-stationary data stream classification and imbalanced data classification. We propose a novel framework for training base classifiers and preparing the dynamic selection dataset (DSEL) to integrate data preprocessing and dynamic ensemble selection (DES) methods for imbalanced data stream classification. The proposed approach was evaluated in computer experiments carried out on 72 artificially generated data streams with various imbalance ratios, levels of label noise, and types of concept drift. In addition, we consider six variations of preprocessing methods and four DES methods. The experimental results show that dynamic ensemble selection, even without any data preprocessing, can outperform a naive combination of the whole pool generated with the use of preprocessing methods. Combining DES with preprocessing further improves the results.
