Recurring Drift Detection and Model Selection-Based Ensemble Classification for Data Streams with Unlabeled Data

Peipei Li, Junhong He, Xuegang Hu, Man Wu

doi:10.1007/s00354-021-00126-2

Abstract

Data stream classification is widely used in fields such as network monitoring, sensor networks, and electronic commerce. In real-world applications, however, recurring concept drift and missing labels make classification considerably harder, yet this challenge has received little attention from the research community. Motivated by this, we propose a new ensemble classification approach for data streams with unlabeled data, based on recurring concept drift detection and model selection. First, we build an ensemble model from classifiers and clusters. To improve classification accuracy, we use the ensemble model to predict each data chunk and partition clusters according to the distribution of the predicted class labels. Second, we adopt a new concept drift detection method, based on the divergence of concept distributions between adjoining data chunks, to distinguish recurring concept drifts. Every concept identified so far is maintained. Meanwhile, we introduce time-stamp-based weights for the base models in the ensemble. When selecting a base model, we consider its time-stamp-based weight and the divergence between concept distributions simultaneously. Finally, extensive experiments on four benchmark data sets show that our approach adapts quickly to data streams with recurring concept drifts and improves classification accuracy compared with several state-of-the-art classification algorithms for data streams with concept drifts and unlabeled data.
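
To make the drift detection idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say which divergence measure is used, how a concept distribution is represented, or how thresholds are chosen. The sketch assumes a concept is summarized by the distribution of predicted class labels in a chunk, substitutes the Jensen-Shannon divergence as the measure, and uses a hypothetical `threshold` of 0.1. The function names `label_distribution`, `js_divergence`, and `classify_chunk` are illustrative only.

```python
import math
from collections import Counter


def label_distribution(labels, classes):
    """Empirical distribution of (predicted) class labels in one chunk."""
    counts = Counter(labels)
    n = len(labels)
    return [counts[c] / n for c in classes]


def js_divergence(p, q):
    """Jensen-Shannon divergence in base 2, so the value lies in [0, 1].

    Assumption: the paper's actual divergence measure is not given in
    the abstract; JS divergence is used here only as a stand-in.
    """
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def classify_chunk(prev, curr, history, threshold=0.1):
    """Compare adjoining chunks, then check stored concepts on drift.

    Returns ("stable", None), ("recurring", i) or ("new", i), where i
    indexes `history`. The threshold value is hypothetical.
    """
    if js_divergence(prev, curr) <= threshold:
        return "stable", None
    # Drift detected: find the closest previously stored concept.
    best = min(
        range(len(history)),
        key=lambda i: js_divergence(history[i], curr),
        default=None,
    )
    if best is not None and js_divergence(history[best], curr) <= threshold:
        return "recurring", best
    # No stored concept is close enough, so keep this one as new.
    history.append(curr)
    return "new", len(history) - 1
```

In this sketch `history` plays the role of the maintained concepts: a chunk whose distribution diverges from its predecessor but matches a stored entry is reported as a recurring drift, which is the point at which a previously trained base model could be reused rather than retrained.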

Journal: New Generation Computing
Publication Date: Apr 20, 2021
Citations: 4

Similar Papers

No Free Lunch Theorem for concept drift detection in streaming data classification: A review
Hanqing Hu ... Tegjyot S Sethi
WIREs Data Mining and Knowledge Discovery | VOL. 10
02 Jul 2019

Adaptive Classification for Concept Drifting Data Streams with Unlabeled Data
...
International Review on Computers and Software | VOL. 8
30 Jun 2013

Incremental semi-supervised Extreme Learning Machine for Mixed data stream classification
Qiude Li ... Min Gao
Expert Systems with Applications | VOL. 185
14 Jul 2021

Peer to peer botnet detection for cyber-security
Mohammad M Masud ... Jiawei Han
12 May 2008
