Abstract

In the era of big data, considerable research focus is being put on designing efficient algorithms capable of learning and extracting high-level knowledge from ubiquitous data streams in an online fashion. While most existing algorithms assume that data samples are drawn from a stationary distribution, many real-world environments deal with data streams that are subject to change over time. Taking this aspect into consideration is an important step towards building truly aware and intelligent systems. In this paper, we propose GNG-A, an adaptive method for incremental unsupervised learning from evolving data streams experiencing various types of change. The proposed method maintains a continuously updated network (graph) of neurons by extending the Growing Neural Gas algorithm with three complementary mechanisms, allowing it to closely track both gradual and sudden changes in the data distribution. First, an adaptation mechanism handles local changes where the distribution is non-stationary only in some regions of the feature space. Second, an adaptive forgetting mechanism identifies and removes neurons that become irrelevant due to the evolving nature of the stream. Finally, a probabilistic evolution mechanism creates new neurons when there is a need to represent data in new regions of the feature space. The proposed method is demonstrated for anomaly and novelty detection in non-stationary environments. Results show that it handles different data distributions and reacts efficiently to various types of change.
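For context, GNG-A builds on standard Growing Neural Gas, whose core online step adapts a small graph of neurons to each incoming sample. The sketch below illustrates that standard GNG step only, not GNG-A itself; the data structures and parameter values (eps_b, eps_n, max_age) are conventional GNG choices assumed here for illustration.

```python
import numpy as np

def gng_update_step(x, neurons, edges, ages, errors,
                    eps_b=0.05, eps_n=0.006, max_age=50):
    """One adaptation step of standard GNG for a single sample x.

    neurons: dict {id: np.ndarray weight vector}
    edges:   set of frozenset({i, j}) undirected links
    ages:    dict {edge: int age}
    errors:  dict {id: float accumulated representation error}
    """
    # 1. Find the two neurons nearest to the sample.
    dists = {i: np.linalg.norm(x - w) for i, w in neurons.items()}
    s1, s2 = sorted(dists, key=dists.get)[:2]

    # 2. Accumulate the winner's representation error.
    errors[s1] += dists[s1] ** 2

    # 3. Move the winner (strongly) and its graph neighbors (weakly)
    #    toward x, aging every edge incident to the winner.
    neurons[s1] += eps_b * (x - neurons[s1])
    for e in [e for e in edges if s1 in e]:
        ages[e] += 1
        neighbor = next(i for i in e if i != s1)
        neurons[neighbor] += eps_n * (x - neurons[neighbor])

    # 4. Link the two winners with a fresh edge (age reset to 0).
    edge = frozenset((s1, s2))
    edges.add(edge)
    ages[edge] = 0

    # 5. Remove edges that are too old (isolated-neuron removal omitted).
    for e in [e for e in edges if ages[e] > max_age]:
        edges.remove(e)
        del ages[e]
```

GNG-A's contribution, as the abstract describes, is to make the forgetting and growth behavior governed by such fixed global parameters adaptive and local to regions of the feature space.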

Highlights

  • Conventional machine learning and data mining methods learn a model by performing several passes over a static dataset

  • We propose GNG-A, an extension of the Growing Neural Gas (GNG) algorithm, and show how it is used for novelty and anomaly detection in evolving data streams

  • GNG-A is summarized in Algorithm 2, which calls Algorithm 3 to check for the removal of neurons and Algorithm 4 to check for the creation of neurons; a skeleton of this control flow is sketched below
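Since the paper's pseudocode is not reproduced on this page, the following hypothetical skeleton only illustrates the control flow that the last highlight describes; the method names are placeholders, not identifiers from the paper.

```python
def gng_a_main_loop(stream, network):
    """Hypothetical skeleton of GNG-A's per-sample processing (Algorithm 2):
    adapt the network, then check for neuron removal (Algorithm 3) and
    neuron creation (Algorithm 4). All names here are assumed."""
    for x in stream:
        network.adapt(x)           # adapt existing neurons to the new sample
        network.check_removal()    # Algorithm 3: drop irrelevant neurons
        network.check_creation(x)  # Algorithm 4: grow into new regions
    return network
```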

Summary

Introduction

Conventional machine learning and data mining methods learn a model by performing several passes over a static dataset. We address the question of how to incrementally adapt to changes in a non-stationary distribution without requiring sensitive hyper-parameters to be manually tuned. The problem is both interesting and important, as evolving data streams are present in a large number of dynamic processes. Existing incremental methods (such as GNG and its variants) require an expert to specify sensitive parameters that directly affect the evolution or the forgetting rate of the neural network. Setting such global parameters prior to learning does not address the more general case where the speed of change can vary over time, or where the distribution becomes non-stationary only in some specific regions of the feature space.
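As a concrete instance of such a global parameter, classical GNG exponentially decreases the accumulated representation error of all neurons by a fixed factor at every iteration. The sketch below shows that fixed-rate forgetting, with an assumed decay value; it is meant only to illustrate the limitation discussed above.

```python
# Fixed-rate forgetting as in classical GNG: after each input, the
# accumulated representation error of every neuron decays by the same
# global factor, whatever the local speed of change in the stream.
BETA = 0.995  # assumed value; in practice it must be hand-tuned per stream

def decay_errors(errors, beta=BETA):
    """Exponentially decrease the representation error of all neurons."""
    for i in errors:
        errors[i] *= beta
```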

Preliminaries and related work
Adaptation of existing neurons
Forgetting by removing irrelevant neurons
Estimating the relevance of a neuron
Adaptive removal of neurons
Dynamic creation of new neurons
Algorithm
Experiments
Datasets
General properties of GNG-A
Anomaly and novelty detection
Conclusion and future work
A Details about datasets
B Details about parameters
