Abstract

Time series data are pervasive across many applications, and detecting anomalies in them is important because it gives early warning of unexpected patterns. In this paper, we propose a method for detecting anomalous subsequences based on multiple similarity measures; it is unsupervised and requires no domain knowledge. First, to improve time efficiency, we introduce an anomaly candidate selection scheme based on locality-sensitive hashing (LSH), which treats a subsequence that does not collide with any other as a potential anomaly. However, when the raw time series is noisy and the anomaly is subtle, the performance of LSH degrades. To address this problem, we present a smoothing method that removes noise and highlights the anomalous part of a time series, which helps decrease the collision probability between an anomaly and the other subsequences. Second, because real applications contain different types of anomalies and a single similarity measure is unlikely to perform consistently well across all of them, we employ Pareto analysis to incorporate multiple similarity measures. Third, we provide a new anomaly scoring scheme, based on the number of non-dominated vectors, to evaluate each anomaly candidate. Finally, we conduct extensive experiments on benchmark datasets from diverse domains and compare our method with state-of-the-art approaches. The results show that our method achieves higher accuracy.
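
Two ideas in the abstract, candidate selection by hash collision and Pareto non-dominance, can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the hash family, the similarity measures, or the scoring formula, so the sign-of-random-projection hash, the fixed-stride windows, the convention that larger dissimilarity values are "more anomalous", and all function names below are assumptions made for illustration. The smoothing step is omitted.

```python
import random
from collections import defaultdict


def sliding_windows(series, width, step):
    """Cut a series into subsequences of length `width`, taken every `step` points."""
    return [series[i:i + width] for i in range(0, len(series) - width + 1, step)]


def sign_hash(window, planes):
    """One bit per random hyperplane: whether the dot product is non-negative."""
    return tuple(sum(w * p for w, p in zip(window, plane)) >= 0 for plane in planes)


def anomaly_candidates(windows, n_bits=8, seed=0):
    """Return indices of windows that share their hash bucket with no other window."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    width = len(windows[0])
    planes = [[rng.gauss(0, 1) for _ in range(width)] for _ in range(n_bits)]
    buckets = defaultdict(list)
    for idx, window in enumerate(windows):
        buckets[sign_hash(window, planes)].append(idx)
    return sorted(members[0] for members in buckets.values() if len(members) == 1)


def dominates(a, b):
    """a dominates b if a is at least as large everywhere and strictly larger somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(vectors):
    """Indices of vectors that no other vector dominates (the Pareto front)."""
    return [
        i for i, v in enumerate(vectors)
        if not any(dominates(u, v) for j, u in enumerate(vectors) if j != i)
    ]
```

Here each entry passed to `non_dominated` would be one candidate's vector of dissimilarities, one component per similarity measure. Identical subsequences always fall into the same bucket, so a perfectly periodic series yields no candidates; whether a given anomalous window lands in a bucket of its own depends on the random hyperplanes and on the number of hash bits.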
