Abstract

When learning to classify data streams, labeling every instance is impractical and expensive. Online active learning over streaming data poses additional challenges because of growing data volumes and concept drift. We propose a new online paired ensemble active learning framework consisting of a stable classifier and a dynamic classifier that is replaced in a timely manner, so that the framework can react to different types of concept drift. Classifiers are built in a block-based way and learn new instances incrementally online. A strategy that combines an uncertainty strategy with a random strategy decides whether to request the label of each incoming instance, which is then used to update both the stable classifier and the dynamic classifier. Experimental evaluation on real datasets shows the advantage of the proposed framework over other approaches.

Highlights

  • With the development of storage technology and networking architecture, data streams are widely produced in areas such as financial activities, traffic flow, sensor networks and web applications [1].

  • A new online paired ensemble active learning framework, consisting of a stable classifier and a dynamic classifier that is replaced in a timely manner, is proposed to react better to different types of concept drift.

  • Algorithm 4: uncertaintyStrategy(x)
    input:
      x: incoming instance
      E: ensemble classifier built with Cs and Cd
      θm = 0.6/numberOfClasses: uncertainty threshold
      s = 0.1: step to adjust threshold θm
    output:
      boolean variable labelling, which indicates whether to request the true label of x
    margin(x) = PE(ŷc1|x) - PE(ŷc2|x)
    If margin(x) < θm
        θm = θm * (1 - s/numberOfClasses)
        return labelling = true
    Else
        return labelling = false
    End If
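
The margin check in Algorithm 4 can be illustrated with a minimal Python sketch. It assumes the ensemble E returns its per-class posterior probabilities as a plain sequence of floats. The class name UncertaintyStrategy, the method should_label and the argument probs are hypothetical names chosen for this illustration and do not come from the original work. Only the initial threshold 0.6/numberOfClasses, the step s = 0.1 and the update rule are taken from the algorithm.

```python
from dataclasses import dataclass, field
from typing import Sequence


@dataclass
class UncertaintyStrategy:
    """Margin-based labelling decision, following Algorithm 4 (illustrative names)."""

    num_classes: int
    step: float = 0.1                      # s in Algorithm 4
    threshold: float = field(init=False)   # theta_m in Algorithm 4

    def __post_init__(self) -> None:
        # Initial uncertainty threshold: 0.6 / numberOfClasses.
        self.threshold = 0.6 / self.num_classes

    def should_label(self, probs: Sequence[float]) -> bool:
        """Return True if the true label of the instance should be requested.

        probs is assumed to hold the ensemble's posterior probability for
        each class (at least two entries).
        """
        top_two = sorted(probs, reverse=True)[:2]
        margin = top_two[0] - top_two[1]
        if margin < self.threshold:
            # Shrink the threshold after each labelling request.
            self.threshold *= 1 - self.step / self.num_classes
            return True
        return False


strategy = UncertaintyStrategy(num_classes=2)
print(strategy.should_label([0.55, 0.45]))  # small margin between the top two classes
print(strategy.should_label([0.90, 0.10]))  # large margin between the top two classes
```

In this sketch the threshold only shrinks, and only when a label is requested, as in the pseudocode above. How the check is combined with the random strategy is not covered here.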

Summary

Introduction

Data streams are widely produced in areas such as financial activities, traffic flow, sensor networks and web applications with the development of storage technology and networking architecture [1]. In these dynamic environments, data streams are massive, temporally ordered, fast changing and potentially infinite [2]. If the chunk size is too large, block-based ensembles may react too slowly to sudden drifts, because old classifiers still retain weight. To adapt to both sudden and gradual changes, it may be suitable to combine the significant features of block-based ensembles and incremental learning approaches. A new online paired ensemble active learning framework, consisting of a stable classifier and a dynamic classifier that is replaced in a timely manner, is proposed to react better to different types of concept drift.

Related Work
Online Paired Ensemble Active Learning Framework for Drifted Data Streams
Experimental Evaluation
Datasets
Accuracy Evaluation
Sensitivity to Parameters
Conclusion