Abstract

Anomaly detection in data streams has become a major research problem in the era of ubiquitous sensing. Large amounts of data are now collected from non-stationary environments, where traditional anomaly detection techniques become ineffective. In this paper, we propose an unsupervised cluster-based algorithm for modelling normal behaviour in non-stationary data streams and detecting anomalous data points. We show that our method scales linearly with the number of observed data points, while the complexity of our model is independent of the size of the data stream. We employ a selective clustering approach to reduce the computation time needed to model the normal data. Our experiments on large-scale synthetic and real-life datasets show that the accuracy of the proposed algorithm is comparable to that of state-of-the-art techniques reported in the literature, while providing substantial improvements in computation time.
