Abstract

Outlier detection aims to find data samples that differ from most other samples. Whereas outlier detection operates at the level of individual instances, anomaly pattern detection on a data stream means detecting a time point at which the pattern generating the data becomes unusual and significantly different from normal behavior. Beyond predicting the outlierness of individual samples in a data stream, it can be very useful to detect the occurrence of anomalous patterns in real time. In this paper, we propose a method for anomaly pattern detection in a data stream that combines binary classification of outliers with statistical tests on the resulting stream of binary labels (normal or outlier). In the first step, a clustering-based outlier detection method transforms the data stream into a stream of binary values, where 0 denotes a sample predicted as normal and 1 a sample predicted as an outlier. In the second step, anomaly pattern detection is performed on the stream of binary values by two approaches: testing the equality of the parameters of the binomial distributions of a reference window and a detection window, and using control charts for the fraction defective. The proposed method achieved an average true positive detection rate of 94% in simulated experiments using real and artificial data. The experimental results also show that the occurrence of an anomaly pattern can be detected reliably even when outlier detection performance is relatively low.
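The second step can be illustrated with a minimal Python sketch. It assumes the first step (clustering-based outlier detection) has already produced a list of 0/1 labels. The abstract does not state which test statistic, window sizes, or control limits the authors use, so the pooled two-proportion z-test, the 3-sigma p-chart limit, the parameters `ref_size`, `det_size`, `alpha`, and `k`, and all function names below are assumptions chosen for illustration, not the paper's implementation.

```python
import math


def two_proportion_z_test(ref, det):
    """One-sided pooled z-test: is the outlier rate in `det` higher than in `ref`?

    Both arguments are sequences of 0/1 labels. Returns (z, p_value).
    Assumed test form; the abstract only says "testing the equality of parameters".
    """
    n1, n2 = len(ref), len(det)
    p1, p2 = sum(ref) / n1, sum(det) / n2
    pooled = (sum(ref) + sum(det)) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 0.0, 1.0  # both windows are all 0 or all 1: no evidence of change
    z = (p2 - p1) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return z, p_value


def p_chart_upper_limit(ref, n, k=3.0):
    """Upper control limit of a p-chart (fraction defective) for subgroups of size n."""
    p_bar = sum(ref) / len(ref)
    return min(1.0, p_bar + k * math.sqrt(p_bar * (1 - p_bar) / n))


def detect(labels, ref_size=200, det_size=50, alpha=0.01):
    """Slide a detection window over the label stream.

    Returns the index of the first label at which either rule fires, or None.
    The reference window is fixed to the first `ref_size` labels (an assumption).
    """
    ref = labels[:ref_size]
    ucl = p_chart_upper_limit(ref, det_size)
    for end in range(ref_size + det_size, len(labels) + 1):
        det = labels[end - det_size:end]
        _, p_value = two_proportion_z_test(ref, det)
        if p_value < alpha or sum(det) / det_size > ucl:
            return end - 1
    return None


if __name__ == "__main__":
    # Synthetic stream: 5% outlier labels for 300 samples, then 50% for 100 samples.
    normal = [1 if i % 20 == 0 else 0 for i in range(300)]
    anomalous = [i % 2 for i in range(100)]
    print(detect(normal + anomalous))
```

Because both rules look only at the fraction of outlier labels in a window, they respond to a shift in that fraction and not to the correctness of any single label, which is in line with the abstract's observation that detection can remain reliable when the underlying outlier detector is imperfect.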
