Abstract

Continuous outlier detection in data streams is an important topic in data mining and has applications in various domains such as fraud detection, weather analysis, and intrusion detection. The non-stationary characteristic of real-world data streams brings the challenge of updating the outlier detection model in a timely and accurate manner. In this paper, we propose a framework for outlier detection in non-stationary data streams (O-NSD) which detects changes in the underlying data distribution to trigger a model update. We propose an improved distance function between sliding windows which offers a monotonicity property; we develop two accurate change detection algorithms, one of which is parameter-free; and we further propose new evaluation measures that quantify the timeliness of the detected changes. Our extensive experiments with real-world and synthetic datasets show that our change detection algorithms outperform the state-of-the-art solution. In addition, we demonstrate our O-NSD framework with two popular unsupervised outlier classifiers. Empirical results show that our framework offers higher accuracy and requires a much lower running time, compared to retrain-based and incremental update approaches.
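The central idea of the abstract, updating the outlier model only when a change in the data distribution is detected, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the class name `DriftTriggeredDetector`, the z-score outlier rule, the thresholds, and in particular the window distance (a plain shift in window means), which stands in for the paper's improved distance function and does not claim its monotonicity property.

```python
from collections import deque
from statistics import mean, pstdev


class DriftTriggeredDetector:
    """Hypothetical z-score outlier detector that is refitted only when a
    simple window-distance test signals a change in the stream."""

    def __init__(self, window_size=50, change_threshold=2.0, outlier_threshold=3.0):
        self.window_size = window_size
        self.change_threshold = change_threshold    # assumed value, not from the paper
        self.outlier_threshold = outlier_threshold  # assumed value, not from the paper
        self.reference = []                         # window the current model was fitted on
        self.current = deque(maxlen=window_size)    # most recent points (sliding window)
        self.mu = 0.0
        self.sigma = 1.0
        self.updates = 0                            # number of change-triggered refits

    def _fit(self, window):
        # "Model update": re-estimate mean and spread from the given window.
        self.reference = list(window)
        self.mu = mean(window)
        self.sigma = pstdev(window) or 1.0          # guard against zero spread

    def _distance(self):
        # Stand-in distance between the reference model and the current window:
        # shift of the window mean, in units of the reference standard deviation.
        return abs(mean(self.current) - self.mu) / self.sigma

    def process(self, x):
        """Consume one point; return True if it is flagged as an outlier."""
        self.current.append(x)
        if not self.reference:
            # Warm-up: fit the first model once the window is full.
            if len(self.current) == self.window_size:
                self._fit(self.current)
            return False
        is_outlier = abs(x - self.mu) / self.sigma > self.outlier_threshold
        # Change detection: refit only if the current window has drifted.
        if self._distance() > self.change_threshold:
            self._fit(self.current)
            self.updates += 1
        return is_outlier
```

Feeding a stream one point at a time with `process(x)` flags isolated spikes without refitting, whereas a sustained shift in the stream pushes the window distance over the threshold and triggers a refit. A retrain-based baseline would instead call `_fit` on every window, which is the cost the change-triggered design tries to avoid.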
