Abstract

We address the problem of unsupervised anomaly detection for multivariate data. Traditional machine learning based anomaly detection algorithms rely on specific assumptions about normal patterns and fail to model complex feature interactions and relations. Recent deep learning based methods show promise in extracting representations from complex features. These methods train a model on an auxiliary task, e.g., reconstruction or prediction, using normal samples. They further assume that the model performs poorly on the auxiliary task for anomalies, since anomalies are never seen during model optimization. However, this assumption does not always hold in practice. Deep models may also perform the auxiliary task well on anomalous samples, causing anomalies to go undetected. To effectively detect anomalies in multivariate data, this paper introduces a teacher-student distillation based framework, the Distillated Teacher-Student Network Ensemble (DTSNE). The teacher-student distillation paradigm is able to handle high-dimensional complex features. In addition, an ensemble of student networks makes the auxiliary task performance less likely to generalize to anomalous samples. To validate the effectiveness of our model, we conduct extensive experiments on real-world datasets. Experimental results show superior performance of DTSNE over competing methods. Analysis and discussion of the behavior of our model are also provided in the experiment section.
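The general idea of scoring anomalies with a teacher and an ensemble of students can be illustrated with a minimal sketch. This is not the DTSNE architecture, which the abstract does not specify. It assumes a fixed random nonlinear projection as the teacher, linear ridge-regression students fitted on bootstrap resamples of the normal data, and a score that sums the teacher-student error and the disagreement among students. The names `make_teacher`, `fit_student`, `student_predict`, and `anomaly_scores` are hypothetical.

```python
import numpy as np

def make_teacher(dim_in, dim_out, rng):
    """Hypothetical fixed teacher: a random, untrained nonlinear projection."""
    W = rng.normal(size=(dim_in, dim_out))
    return lambda X: np.tanh(X @ W)

def fit_student(X, T, ridge=1e-3):
    """Fit a linear student that regresses teacher outputs T from X (closed-form ridge)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])        # append a bias column
    A = Xb.T @ Xb + ridge * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ T)

def student_predict(W, X):
    """Apply a fitted student to new samples."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return Xb @ W

def anomaly_scores(X, teacher, students):
    """Score = mean teacher-student squared error + mean variance across students."""
    T = teacher(X)                                                # (n, d_out)
    preds = np.stack([student_predict(W, X) for W in students])  # (k, n, d_out)
    error = ((preds - T) ** 2).mean(axis=(0, 2))                  # (n,)
    disagreement = preds.var(axis=0).mean(axis=1)                 # (n,)
    return error + disagreement

rng = np.random.default_rng(0)

# Assumed toy data: normal samples cluster near the origin.
X_train = 0.3 * rng.normal(size=(500, 5))
teacher = make_teacher(5, 8, rng)

# Each student mimics the teacher on a bootstrap resample of the normal data.
students = []
for _ in range(5):
    idx = rng.integers(0, len(X_train), size=len(X_train))
    students.append(fit_student(X_train[idx], teacher(X_train[idx])))

X_normal = 0.3 * rng.normal(size=(100, 5))          # held-out normal samples
X_anomalous = 4.0 + 0.3 * rng.normal(size=(20, 5))  # samples far from the training data

scores_normal = anomaly_scores(X_normal, teacher, students)
scores_anomalous = anomaly_scores(X_anomalous, teacher, students)
print("mean score (normal):   ", scores_normal.mean())
print("mean score (anomalous):", scores_anomalous.mean())
```

In this toy setting the students match the teacher only where they were trained, so samples far from the normal data receive higher scores. A practical system would likely use neural networks for both teacher and students.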

Highlights

  • Anomaly detection (a.k.a. outlier detection) [1,2] refers to detecting data points that significantly deviate from normal behavior

  • We present a novel framework to achieve efficient and accurate anomaly detection

  • We address the problem of fully unsupervised anomaly detection for multivariate data


Summary

Introduction

Anomaly detection (a.k.a. outlier detection) [1,2] refers to detecting data points that significantly deviate from normal behavior. Identifying anomalies in multivariate data provides valuable information in various domains. Because of this empirical value, efficient and accurate anomaly detection algorithms are desired. Anomalies are associated with unknownness, irregularity, and rarity [2]. Unknownness indicates that anomalous events cannot be observed until they happen. Irregularity means that the class structure of anomalies is highly heterogeneous. Rarity means that anomalies make up a small fraction of the collected data, leading to the problem of class imbalance. Due to the difficulty of collecting a large-scale labeled anomaly dataset, fully supervised methods are impractical in real-world scenarios. Unsupervised anomaly detection does not require labeled training data.
