Abstract

Learning from non-stationary data streams is inherently challenging due to their evolving nature and concept drift. Furthermore, the assumption that all instances come labeled is often impractical in real-world applications. Many strategies have been proposed to tackle learning from sparsely labeled data streams. However, they typically rely on fixed labeling budgets, which can be a limitation in the context of drifting data streams. In this study, we introduce a novel active learning strategy that dynamically manages the labeling budget to optimize its utilization and adapt promptly to concept drift. Our approach continuously monitors the data stream for concept drift, and upon detecting such drift, it dynamically increases the maximum labeling budget for a predefined time window. This adjustment provides the classifier with more flexibility to adapt to the new concept. We conducted experiments using 7 synthetic data generators encompassing various drifting scenarios and 7 real-world data streams with different labeling budgets. Our results demonstrate that offering a flexible budget to the classifier can significantly enhance performance compared to merely increasing a fixed budget. Notably, our strategy outperformed state-of-the-art active learning strategies, all while maintaining a comparable or lower number of labeled instances. Experiments are available at https://github.com/gabrieljaguiar/DBAL.
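The budget mechanism described above can be sketched as follows. This is a minimal illustration under assumed details, not the paper's actual implementation: the class name `DynamicBudgetManager`, the parameters (`base_budget`, `boost_factor`, `window`), and the uncertainty threshold are all hypothetical choices for exposition.

```python
class DynamicBudgetManager:
    """Hypothetical sketch of a drift-aware labeling budget: the budget is
    fixed at a base rate, but temporarily boosted for a window of instances
    after a drift detector fires."""

    def __init__(self, base_budget=0.1, boost_factor=2.0, window=500):
        self.base_budget = base_budget    # fraction of instances labeled normally
        self.boost_factor = boost_factor  # multiplier applied after a drift alarm
        self.window = window              # how many instances the boost lasts
        self._boost_left = 0              # remaining instances in the boost window
        self._labeled = 0                 # labels spent so far
        self._seen = 0                    # instances seen so far

    def current_budget(self):
        # Boosted budget while the post-drift window is active, capped at 1.0.
        if self._boost_left > 0:
            return min(1.0, self.base_budget * self.boost_factor)
        return self.base_budget

    def on_drift(self):
        # A drift detector (e.g. ADWIN) would call this on an alarm.
        self._boost_left = self.window

    def should_label(self, uncertainty):
        # Request a label when the classifier is uncertain and total spending
        # stays within the (possibly boosted) budget.
        self._seen += 1
        if self._boost_left > 0:
            self._boost_left -= 1
        spending = self._labeled / self._seen
        if uncertainty > 0.5 and spending < self.current_budget():
            self._labeled += 1
            return True
        return False
```

In this sketch, a stream loop would call `should_label` for each arriving instance and `on_drift` whenever its detector signals a change, so labeling effort concentrates exactly where the new concept must be learned.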
