Learning evolving prototypes for imbalanced data stream classification with limited labels

Zhonglin Wu, Hongliang Wang, Jingxia Guo, Qinli Yang, Junming Shao

doi:10.1016/j.ins.2024.120979

Abstract

Real-world data streams often exhibit long-tailed distributions with heavy class imbalance, which poses great challenges for data stream classification, especially under label scarcity and concept drift. Several active learning methods have been proposed to address this problem by selecting the most valuable instances for labeling. However, existing methods often struggle to dynamically identify the instances that truly represent the current concept, and they still require a large label budget. In this work, we propose a new algorithm, LEPID, which combines dynamic micro-cluster concept modeling with local entropy modeling to select the currently important concepts and prototypes. Specifically, we give greater weight to concept drift prototypes and minority prototypes so as to focus on the regions that represent current concepts. We use a local entropy strategy based on micro-clusters to select the most valuable instances for labeling, which reduces the label budget. Extensive experiments on real-world and synthetic imbalanced datasets show that, compared to state-of-the-art algorithms, our method adapts naturally to concept drift and dynamically captures the current and most valuable prototypes, achieving better results even under label scarcity.
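The abstract does not give LEPID's exact formulation, so the following is only a minimal sketch of the general idea of entropy-based label querying over micro-clusters, not the authors' method. It assumes each micro-cluster is a plain dictionary with a `center` and a `label`, and that an instance is worth labeling when the labels of its `k` nearest micro-clusters disagree. The names `local_entropy`, `should_query`, `k`, and `threshold` are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def local_entropy(x, micro_clusters, k=3):
    """Shannon entropy (in bits) of the labels of the k micro-clusters nearest to x.

    Assumption: a micro-cluster is a dict with a "center" (tuple of floats)
    and a "label". This representation is illustrative, not from the paper.
    """
    nearest = sorted(micro_clusters, key=lambda mc: math.dist(x, mc["center"]))[:k]
    counts = Counter(mc["label"] for mc in nearest)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def should_query(x, micro_clusters, k=3, threshold=0.5):
    """Request a label only when the local neighborhood is uncertain.

    Assumption: a fixed entropy threshold stands in for whatever budget
    rule the actual algorithm uses.
    """
    return local_entropy(x, micro_clusters, k) > threshold


# Hypothetical micro-clusters: a pure majority region and a mixed region.
clusters = [
    {"center": (0.0, 0.0), "label": "majority"},
    {"center": (0.0, 1.0), "label": "majority"},
    {"center": (5.0, 5.0), "label": "minority"},
    {"center": (5.0, 6.0), "label": "majority"},
]

in_pure_region = should_query((0.1, 0.2), clusters, k=2)   # neighbors agree
in_mixed_region = should_query((5.0, 5.5), clusters, k=2)  # neighbors disagree
```

In this sketch, instances that fall in regions where neighboring micro-clusters carry the same label are skipped, and the label budget is spent only where the local class structure is ambiguous. Weighting drift and minority prototypes, as the abstract describes, is not modeled here.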

