Robust Textual Data Streams Mining Based on Continuous Transfer Learning

Bo Liu, Yanshan Xiao, Longbing Cao, Zhifeng Hao, Philip S Yu

doi:10.1137/1.9781611972832.81

Abstract

In textual data stream environments, concept drift can occur at any time. Existing approaches that partition streams into chunks can run into problems when a chunk boundary does not coincide with the change point, which is impossible to predict. Since concept drift can occur at any point in a stream, it will certainly occur within chunks; this is called random concept drift. This paper proposes an approach, called the chunk level-based concept drift method (CLCD), that overcomes this chunking problem by continuously monitoring chunk characteristics and revising the classifier through transfer learning in a positive and unlabeled (PU) textual data stream environment. The proposed approach works in three steps. In the first step, we propose core vocabulary-based criteria to justify and identify random concept drift. In the second step, we put forward an extension of LELC (PU learning by extracting likely positive and negative microclusters) [1], called soft-LELC, which extracts representative examples from unlabeled data and assigns a confidence score to each extracted example. The confidence score represents the degree to which an example belongs to its corresponding class. In the third step, we set up a transfer learning-based SVM to build an accurate classifier for the chunks where concept drift was identified in the first step. Extensive experiments show that CLCD can capture random concept drift and that it outperforms state-of-the-art methods in positive and unlabeled textual data stream environments.
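The abstract does not spell out the core vocabulary-based criteria, so the following is only an illustrative sketch of the general idea behind the first step, not the CLCD method itself. It assumes, hypothetically, that a "core vocabulary" is the set of the `k` most frequent tokens in a window of documents, and that drift is flagged when the Jaccard overlap between a reference vocabulary and the current window's vocabulary falls below a threshold. The function names, the window size, `k`, and the threshold are all assumptions introduced for illustration.

```python
from collections import Counter


def core_vocabulary(docs, k=5):
    """Return the k most frequent lowercase tokens in docs.

    Assumption: this frequency-based definition is a stand-in; the paper's
    actual core-vocabulary criteria are not given in the abstract.
    """
    counts = Counter(tok for doc in docs for tok in doc.lower().split())
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {tok for tok, _ in ranked[:k]}


def jaccard(a, b):
    """Jaccard similarity of two sets (defined as 1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def drift_positions(stream, window=3, k=5, threshold=0.3):
    """Slide a window over the stream one document at a time.

    Returns the start indices of windows whose core vocabulary overlaps the
    reference vocabulary by less than `threshold`. Because the window moves
    document by document, a change can be flagged anywhere in the stream,
    not only at fixed chunk boundaries.
    """
    reference = core_vocabulary(stream[:window], k)
    flagged = []
    for start in range(window, len(stream) - window + 1):
        current = core_vocabulary(stream[start:start + window], k)
        if jaccard(reference, current) < threshold:
            flagged.append(start)
            reference = current  # treat the new concept as the reference
    return flagged


if __name__ == "__main__":
    # Toy stream: the topic changes partway through.
    toy_stream = ["goal match team"] * 4 + ["stock market bank"] * 4
    print(drift_positions(toy_stream, window=3, k=3, threshold=0.3))
```

In this sketch the monitoring is continuous rather than tied to chunk boundaries, which is the property the abstract emphasizes. A fuller pipeline would pass the flagged windows on to the later steps (example extraction with confidence scores, then classifier revision), which are not sketched here.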
