Abstract

In stream learning, data arrives continuously over time, often at a very high rate. For imbalanced data streams with concept drift, classification accuracy and time efficiency must be addressed simultaneously. However, existing algorithms either demand excessive computation time or achieve relatively low classification performance. To address these issues, this paper proposes a cost-sensitive continuous ensemble kernel learning method (CCEKL) for imbalanced data streams with concept drift. Firstly, for two-class data streams, a novel misclassification cost is proposed that adapts to the changing imbalance rate. Secondly, based on the resulting modified loss function, a continuous kernel learning method is applied to adapt to the changing data distribution. Furthermore, the two-class method is extended to multi-class classification to handle multi-class dynamic imbalance problems. Finally, an ensemble approach is used to address the proposed algorithm's sensitivity to the initial kernel width. To further improve efficiency, a parallel approach is applied to train multiple classifiers with different initial kernel widths simultaneously. Experimental results on 30 data streams show that, for binary classification, CCEKL achieves superior balanced accuracy with shorter training time. For multi-class classification, CCEKL attains better balanced accuracy without requiring excessive training time, surpassing most of the state-of-the-art baseline algorithms.
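
As a rough illustration of the ideas summarized above, and not the CCEKL algorithm itself (its misclassification cost and kernel update are not specified in this abstract), the following minimal Python sketch combines three generic ingredients: a cost that grows as a class becomes rarer in the stream, an online kernel classifier, and an ensemble whose members differ only in kernel width. The budgeted kernel perceptron, the inverse-frequency cost, and all names (`OnlineKernelClassifier`, `WidthEnsemble`, `budget`, `widths`) are assumptions chosen for illustration; the parallel training step is omitted.

```python
import math
from collections import Counter


class OnlineKernelClassifier:
    """Budgeted kernel perceptron with an RBF kernel (illustrative stand-in)."""

    def __init__(self, width, budget=50):
        self.width = width      # RBF kernel width (fixed here, unlike CCEKL)
        self.budget = budget    # hypothetical cap on stored support vectors
        self.support = []       # list of (point, signed weight)

    def _kernel(self, a, b):
        sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        return math.exp(-sq_dist / (2.0 * self.width ** 2))

    def score(self, x):
        return sum(w * self._kernel(s, x) for s, w in self.support)

    def update(self, x, y, cost):
        # y is -1 or +1. On a mistake, store x with a cost-scaled weight.
        if y * self.score(x) <= 0:
            self.support.append((x, y * cost))
            if len(self.support) > self.budget:
                self.support.pop(0)  # drop the oldest point (crude forgetting)


class WidthEnsemble:
    """Majority vote over classifiers that differ only in kernel width."""

    def __init__(self, widths=(0.1, 1.0, 10.0)):
        self.members = [OnlineKernelClassifier(w) for w in widths]
        self.counts = Counter()  # running class counts seen so far

    def cost(self, y):
        # Assumed cost: inverse running class frequency, so the rarer
        # class is penalized more heavily when misclassified.
        total = sum(self.counts.values())
        return total / (2.0 * self.counts[y])

    def predict(self, x):
        votes = sum(1 if m.score(x) > 0 else -1 for m in self.members)
        return 1 if votes > 0 else -1

    def learn(self, x, y):
        self.counts[y] += 1
        c = self.cost(y)
        for m in self.members:
            m.update(x, y, c)
```

Calling `learn(x, y)` once per arriving example and `predict(x)` on demand gives a test-then-train loop. Because the cost is recomputed from running counts at every step, it follows the imbalance rate as the stream evolves, which is the general behaviour the abstract attributes to its adaptive misclassification cost.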
