Bayesian Nonparametric Unsupervised Concept Drift Detection for Data Stream Mining

Junyu Xuan, Jie Lu, Guangquan Zhang

doi:10.1145/3420034

Abstract

Online data stream mining is of great practical significance because data streams are ubiquitous in real-world scenarios, especially in the big data era. Traditional data mining algorithms cannot be directly applied to data streams because (1) the underlying data distribution may change over time (i.e., concept drift) and (2) labels for streaming data are often delayed, scarce, or entirely unavailable in practice. A new research area, named unsupervised concept drift detection, has emerged to tackle this difficulty, mainly based on two-sample hypothesis tests such as the Kolmogorov–Smirnov test. However, it is surprising that none of the existing methods in this area exploit the Bayesian nonparametric hypothesis test, which offers clear interpretability and straightforward encoding of prior knowledge, and which places no strict or unrealistic requirement on fixing the form of the underlying data distribution in advance. In this article, we present a Bayesian nonparametric unsupervised concept drift detection method based on the Polya tree hypothesis test. The basic idea is to decompose the underlying data distribution into a multi-resolution representation that transforms the hypothesis test on the whole distribution into a recursion of simple binomial tests. In addition, an incremental mechanism is specially designed to improve efficiency in the stream setting. The method effectively detects drifts, locates where a drift happens, and provides the posteriors of the hypotheses. Experiments on synthetic data verify the desired properties of the proposed method, and experiments on real-world data show that it performs better for data stream mining than its frequentist counterpart in the literature.
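The multi-resolution idea in the abstract can be illustrated with a minimal sketch of a Polya tree two-sample test. This is not the authors' implementation and omits their incremental mechanism. It assumes values already lie in [0, 1), a fixed dyadic partition, a Beta(a, a) prior at each split with a = c * depth^2 (a common choice, assumed here), and equal prior odds; the names `polya_tree_log_bf` and `posterior_h0` are hypothetical. At every split, the left/right counts of the two windows are compared under a shared branching probability (H0, no drift) versus separate ones (H1, drift).

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def polya_tree_log_bf(x, y, max_depth=6, c=1.0):
    """Log Bayes factor log p(x, y | H0) - log p(x, y | H1).

    H0: both samples share one distribution; H1: each has its own.
    Assumption: all values lie in [0, 1) and the tree uses dyadic splits.
    """
    log_bf = 0.0
    for depth in range(1, max_depth + 1):
        a = c * depth ** 2      # assumed Beta(a, a) prior at this level
        n_cells = 2 ** depth    # number of child intervals at this level
        cx = [0] * n_cells
        cy = [0] * n_cells
        for v in x:
            cx[min(int(v * n_cells), n_cells - 1)] += 1
        for v in y:
            cy[min(int(v * n_cells), n_cells - 1)] += 1
        # Each parent interval splits into a (left, right) pair of children;
        # every pair contributes one beta-binomial comparison.
        for left in range(0, n_cells, 2):
            xl, xr = cx[left], cx[left + 1]
            yl, yr = cy[left], cy[left + 1]
            log_bf += (log_beta(a + xl + yl, a + xr + yr) + log_beta(a, a)
                       - log_beta(a + xl, a + xr) - log_beta(a + yl, a + yr))
    return log_bf


def posterior_h0(log_bf, prior_h0=0.5):
    """Posterior probability of H0 (no drift), computed stably."""
    log_odds = log_bf + math.log(prior_h0) - math.log(1.0 - prior_h0)
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


# Illustrative windows (synthetic, not from the paper).
reference = [(i + 0.5) / 100 for i in range(100)]       # spread over [0, 1)
same = list(reference)                                  # no change
shifted = [0.5 + (i + 0.5) / 200 for i in range(100)]   # mass moved to [0.5, 1)

p_same = posterior_h0(polya_tree_log_bf(reference, same))
p_shifted = posterior_h0(polya_tree_log_bf(reference, shifted))
print(p_same, p_shifted)
```

In this sketch the posterior for the unchanged window should stay above 0.5, while the shifted window should drive it toward 0. Because the log Bayes factor is a sum over tree nodes, the nodes with the most negative contributions indicate where in the data range the two windows differ, which mirrors the drift-locating property described in the abstract.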

Journal: ACM Transactions on Intelligent Systems and Technology
Publication Date: Nov 13, 2020
Citations: 16

Similar Papers

Fast Unsupervised Online Drift Detection Using Incremental Kolmogorov-Smirnov Test
Denis Moreira Dos Reis ... Gustavo Batista
13 Aug 2016

Unsupervised Online Concept Drift Detection Based on Divergence and EWMA
Qilin Fan ... Yang Li
01 Jan 2023

Detecting concept drift using HEDDM in data stream
Snehlata S Dongre ... Latesh G Malik
International Journal of Intelligent Engineering Informatics | VOL. 7
01 Jan 2019
