Abstract

Most methods for classifying data streams assume that the class distribution is balanced. Unfortunately, class imbalance is widespread in real-world applications. In addition, the underlying concept of a data stream may change over time, and attacks increase the difficulty of data stream mining. Motivated by this challenge, we propose a Two-Stage Cost-Sensitive (TSCS) classification framework that addresses class imbalance in non-stationary data streams. The framework uses cost information in both the feature selection stage and the classification stage. Moreover, a window adaptation and drift detection mechanism, which guarantees that the ensemble can adapt promptly to concept drift, is embedded in our method. We compare our algorithm with competitive algorithms on different kinds of datasets. The results demonstrate that TSCS obtains significant improvement on metrics for class-imbalanced data streams.
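
To make the two ideas named above concrete, the following is a minimal sketch and not the TSCS algorithm itself, whose details the abstract does not give. It assumes a binary problem with the minority class labeled 1, and it substitutes two generic techniques: the standard minimum-expected-cost decision rule for the classification stage, and a simple sliding window that is cut back when recent misclassification cost rises. All names (`min_cost_label`, `misclassification_cost`, `CostWindow`) and the parameters `max_size`, `min_size`, and `margin` are hypothetical choices made for illustration.

```python
from collections import deque


def min_cost_label(p_minority, cost_fn, cost_fp):
    """Pick the label with the lower expected misclassification cost.

    Predicting 0 risks a false negative: expected cost p * cost_fn.
    Predicting 1 risks a false positive: expected cost (1 - p) * cost_fp.
    """
    return 1 if p_minority * cost_fn >= (1.0 - p_minority) * cost_fp else 0


def misclassification_cost(y_true, y_pred, cost_fn, cost_fp):
    """Cost incurred by one prediction (0.0 when it is correct)."""
    if y_true == 1 and y_pred == 0:
        return cost_fn
    if y_true == 0 and y_pred == 1:
        return cost_fp
    return 0.0


class CostWindow:
    """Illustrative drift check (an assumption, not the paper's detector).

    Keeps recent per-example costs and flags drift when the mean cost of
    the newer half exceeds the mean cost of the older half by `margin`.
    On drift, the older half is discarded so the window shrinks.
    """

    def __init__(self, max_size=200, min_size=20, margin=0.5):
        self.costs = deque(maxlen=max_size)
        self.min_size = min_size
        self.margin = margin

    def add(self, cost):
        """Record one cost; return True if drift is flagged."""
        self.costs.append(cost)
        n = len(self.costs)
        if n < self.min_size:
            return False
        items = list(self.costs)
        half = n // 2
        old_mean = sum(items[:half]) / half
        new_mean = sum(items[half:]) / (n - half)
        if new_mean - old_mean > self.margin:
            # Forget the older half so later statistics reflect the new concept.
            self.costs = deque(items[half:], maxlen=self.costs.maxlen)
            return True
        return False
```

With a false negative costed ten times higher than a false positive, `min_cost_label(0.2, 10.0, 1.0)` selects the minority class even though its estimated probability is only 0.2, whereas equal costs give the majority class. In a streaming loop, each prediction's cost would be passed to `CostWindow.add`, and a `True` return would be the cue to retrain or reweight the model on the shortened window.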
