Abstract

Online supervised learning from fast-evolving data streams, particularly in domains such as health, the environment, and manufacturing, is a crucial research area. However, these domains often experience class imbalance, which can skew class distributions. It is essential for online learning algorithms to analyze large datasets in real-time while accurately modeling rare or infrequent classes that may appear in bursts. While methods have been proposed to handle binary class imbalance, there is a lack of attention to multi-class imbalanced settings with varying degrees of imbalance in evolving streams. In this paper, we present the Dynamic Queues (DynaQ) algorithm for online learning in multi-class imbalanced settings to fill this knowledge gap. Our approach utilizes a batch-based resampling method that creates an instance queue for each class to balance the number of instances. We maintain a queue threshold and remove older samples during training. Additionally, we dynamically oversample minority classes based on one of four rate parameters: recall, F1-score, $\kappa_m$, and Euclidean distance. Our learning algorithm consists of an ensemble that uses sliding windows and a soft voting schema while incorporating a drift detection mechanism. Our experimental results demonstrate the superiority of the DynaQ approach over state-of-the-art methods.
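
The per-class queue idea described above can be illustrated with a minimal Python sketch. This is not the DynaQ implementation: the class name `ClassQueues`, the `max_len` threshold, and the `rates` mapping are hypothetical, and the assumption that a rate in [0, 1] (for example, one minus a class's recent recall) scales how much of the gap to the largest queue is filled by sampling with replacement is ours, since the abstract does not specify how the rate parameters are applied.

```python
import random
from collections import defaultdict, deque


class ClassQueues:
    """Hypothetical per-class bounded queues (illustrative only)."""

    def __init__(self, max_len):
        # max_len plays the role of the queue threshold (assumed fixed here).
        self.max_len = max_len
        self.queues = defaultdict(lambda: deque(maxlen=max_len))

    def add(self, x, y):
        # A full deque silently drops its oldest instance on append.
        self.queues[y].append(x)

    def balanced_batch(self, rates=None, rng=None):
        """Return (instance, label) pairs with minority classes oversampled.

        rates maps a label to a value in [0, 1]; it is an assumed stand-in
        for a performance-based rate such as 1 - recall. Missing labels
        default to 1.0, i.e. oversample up to the largest queue.
        """
        rng = rng or random.Random()
        rates = rates or {}
        if not self.queues:
            return []
        largest = max(len(q) for q in self.queues.values())
        batch = []
        for label, queue in self.queues.items():
            items = list(queue)
            deficit = largest - len(items)
            rate = min(max(rates.get(label, 1.0), 0.0), 1.0)
            extra = round(rate * deficit)
            # Oversample by drawing stored instances with replacement.
            sampled = [rng.choice(items) for _ in range(extra)]
            batch.extend((x, label) for x in items + sampled)
        return batch
```

In this sketch, a batch produced by `balanced_batch` would be passed to the learner in place of the raw stream batch; the ensemble, sliding windows, soft voting, and drift detection mentioned in the abstract are outside its scope.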
