Expected similarity estimation for large-scale batch and streaming anomaly detection

Markus Schneider, Wolfgang Ertel, Fabio Ramos

doi:10.1007/s10994-016-5567-7

Abstract

We present a novel algorithm for anomaly detection on very large datasets and data streams. The method, named EXPected Similarity Estimation (EXPoSE), is kernel-based and efficiently computes the similarity between new data points and the distribution of regular data. The estimator is formulated as an inner product with a reproducing kernel Hilbert space embedding and makes no assumption about the type or shape of the underlying data distribution. We show that offline (batch) learning with EXPoSE can be done in linear time, and that online (incremental) learning takes constant time per instance and model update. Furthermore, EXPoSE makes predictions in constant time while requiring only constant memory. In addition, we propose different methodologies for concept drift adaptation on evolving data streams. On several real datasets, we demonstrate that our approach can compete with state-of-the-art algorithms for anomaly detection while being an order of magnitude faster than most other approaches.
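The idea in the abstract, scoring a point by its inner product with an embedding of the regular data, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Gaussian kernel approximated by random Fourier features, which the abstract does not specify, so that the embedding becomes a fixed-length running mean. The class name `ExpectedSimilaritySketch` and the parameters `n_features`, `gamma`, and `seed` are hypothetical.

```python
import math
import random


class ExpectedSimilaritySketch:
    """Illustrative only: running mean of random Fourier features.

    Assumption: k(x, y) = exp(-gamma * ||x - y||^2), approximated with
    n_features random cosine features (not taken from the abstract).
    """

    def __init__(self, dim, n_features=512, gamma=0.5, seed=0):
        rng = random.Random(seed)
        # Frequencies for this Gaussian kernel are drawn from N(0, 2 * gamma * I).
        scale = math.sqrt(2.0 * gamma)
        self.weights = [[rng.gauss(0.0, scale) for _ in range(dim)]
                        for _ in range(n_features)]
        self.offsets = [rng.uniform(0.0, 2.0 * math.pi)
                        for _ in range(n_features)]
        self.norm = math.sqrt(2.0 / n_features)
        self.mean = [0.0] * n_features  # fixed-size model: constant memory
        self.count = 0

    def features(self, x):
        # Feature map phi(x); phi(x) . phi(y) approximates k(x, y).
        return [self.norm * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(self.weights, self.offsets)]

    def update(self, x):
        # Incremental mean update; cost does not depend on points seen so far.
        self.count += 1
        phi = self.features(x)
        self.mean = [m + (p - m) / self.count for m, p in zip(self.mean, phi)]

    def fit(self, data):
        # Batch learning is one pass over the data, hence linear time.
        for x in data:
            self.update(x)
        return self

    def score(self, x):
        # Approximate expected similarity of x to the regular data.
        return sum(p * m for p, m in zip(self.features(x), self.mean))


rng = random.Random(1)
regular = [[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(200)]
model = ExpectedSimilaritySketch(dim=2).fit(regular)
print(model.score([0.0, 0.0]))  # close to the regular data
print(model.score([6.0, 6.0]))  # far from the regular data
```

Under these assumptions a low score marks a point as a likely anomaly. The model is a single vector of length `n_features`, so both `update` and `score` take time independent of the number of training points.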

Journal: Machine Learning
Publication Date: May 18, 2016
Citations: 41

Similar Papers

A Systematic Review of Density Grid-Based Clustering for Data Streams
Mustafa Tareq ... Elankovan A Sundararajan
IEEE Access | VOL. 10
01 Jan 2021

Prequential AUC for Classifier Evaluation and Drift Detection in Evolving Data Streams
Dariusz Brzezinski ... Jerzy Stefanowski
01 Jan 2015

A Comprehensive Review on Evolving Data Stream Clustering
Kareema Batool ... Ghulam Abbas
21 Sep 2021

Tutorial: Data Stream Mining and Its Applications
Latifur Khan ... Wei Fan
01 Jan 2012
