Three-layer concept drifting detection in text data streams

Yuhong Zhang, Guang Chu, Peipei Li, Xuegang Hu, Xindong Wu

doi:10.1016/j.neucom.2017.04.047

Abstract

Text data streams are common in real-world applications, where concept drifts pose a significant challenge for classification. Compared with relational data streams, concept drifts hidden in text streams are usually reflected in the relationship between the feature vector and the instance labels. Existing concept drifting detection methods, however, are mainly based on classification error rates. When applied to text streams, these methods perform poorly on measures such as false alarms and missed detections. Motivated by this, we first give a systematic analysis of concept drifts in text data streams. We then propose a three-layer concept drifting detection approach, in which the three layers correspond to the label space, the feature space, and the mapping relationships between labels and features, respectively. In this approach, the latter two layers are based on the values of WoE (Weight of Evidence) and the IV (Information Value) index. Experimental results show that our approach can improve both the performance of concept drifting detection and the accuracy of classification, especially when concept drifts in text data streams are frequent.
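The abstract names WoE and IV but does not spell out how they are computed. As a rough illustration only, the sketch below uses the textbook definitions of WoE and IV for a single binary feature (whether a term is present in a document) against a binary label within one window of a stream. The function name `woe_iv`, the additive smoothing constant `eps`, and the restriction to binary labels are assumptions made for this example; they are not taken from the paper and may differ from the authors' formulation.

```python
import math
from typing import Dict, Sequence, Tuple


def woe_iv(
    present: Sequence[int],
    labels: Sequence[int],
    eps: float = 0.5,  # assumed additive smoothing to avoid log(0)
) -> Tuple[Dict[int, float], float]:
    """Textbook WoE per bin and total IV for one binary term feature.

    present[i] is 1 if the term occurs in document i, else 0.
    labels[i] is 1 for the positive class, else 0.
    """
    if len(present) != len(labels):
        raise ValueError("present and labels must have the same length")

    # bin (0 = term absent, 1 = term present) -> [negatives, positives]
    counts = {0: [0, 0], 1: [0, 0]}
    for x, y in zip(present, labels):
        counts[int(bool(x))][int(bool(y))] += 1

    total_neg = sum(c[0] for c in counts.values())
    total_pos = sum(c[1] for c in counts.values())
    n_bins = len(counts)

    woe: Dict[int, float] = {}
    iv = 0.0
    for b, (neg, pos) in counts.items():
        # Smoothed share of each class that falls into this bin.
        p_pos = (pos + eps) / (total_pos + eps * n_bins)
        p_neg = (neg + eps) / (total_neg + eps * n_bins)
        woe[b] = math.log(p_pos / p_neg)
        iv += (p_pos - p_neg) * woe[b]
    return woe, iv


if __name__ == "__main__":
    # Two hypothetical windows: the term is unrelated to the label in the
    # first window and perfectly aligned with it in the second.
    _, iv_old = woe_iv([1, 1, 0, 0], [1, 0, 1, 0])
    _, iv_new = woe_iv([1, 1, 0, 0], [1, 1, 0, 0])
    print(f"IV old window: {iv_old:.3f}, IV new window: {iv_new:.3f}")
```

Under these assumptions, a large change in a term's IV between consecutive windows would suggest that the relationship between that feature and the labels has shifted, which is the kind of signal the feature-label mapping layer is described as monitoring. How the paper thresholds or aggregates such changes is not stated in the abstract.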

Journal: Neurocomputing	Publication Date: May 10, 2017
Citations: 36
