Abstract
Multivariate time series anomaly detection on key performance indicators helps mitigate the impact of large-scale IT system anomalies. Because multivariate time series are voluminous and abstract, previous works have tended to make overly strict or overly optimistic assumptions about labeling costs, leading to unsatisfactory results. It therefore remains a challenge to strike an appropriate trade-off between labeling cost and model performance. This work proposes AMAD, an active learning-based approach that addresses this problem. Its core idea is to supply the learner model with high-value label queries via an ensemble query strategy that dynamically adapts to the estimated anomaly ratio and the evolving model performance. Moreover, to our knowledge, AMAD is the first to examine in detail the marginal effect of query strategies on model performance, which has not been investigated in previous work on time series anomaly detection. Extensive experiments on five public datasets demonstrate that AMAD performs well and robustly across a variety of real-world scenarios, outperforming state-of-the-art baselines by 16% in recall and 11% in F1 score with only 3% of the data labeled.
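As a rough illustration of the active learning workflow the abstract describes, the Python sketch below shows a pool-based labeling loop in which an ensemble query strategy blends an uncertainty criterion with an anomaly-score criterion, shifting its weighting as model performance changes. The `ensemble_query` heuristic, the `predict_uncertainty`, `predict_anomaly_score`, and `validation_f1` hooks, and the weighting rule are illustrative assumptions, not AMAD's actual query strategy.

```python
import numpy as np

def ensemble_query(uncertainty, anomaly_score, model_f1, budget):
    """Hypothetical ensemble query strategy (not AMAD's exact rule):
    weight toward raw anomaly scores while the model is weak, and toward
    uncertainty sampling as its validation F1 improves."""
    w = float(np.clip(model_f1, 0.0, 1.0))
    combined = w * uncertainty + (1.0 - w) * anomaly_score
    # Query the top-`budget` windows by the combined score.
    return np.argsort(-combined)[:budget]

def active_learning_loop(model, windows, oracle, total_budget, rounds):
    """Generic pool-based active learning loop for time-series anomaly detection.
    `model` and `oracle` are assumed interfaces: the model scores windows and is
    refit on labeled data; the oracle returns a human label for a window index."""
    labeled_X, labeled_y = [], []
    pool = list(range(len(windows)))
    per_round = max(1, total_budget // rounds)
    model_f1 = 0.0  # assumed to be re-estimated on held-out data each round
    for _ in range(rounds):
        X_pool = windows[pool]
        u = model.predict_uncertainty(X_pool)      # assumed hook
        a = model.predict_anomaly_score(X_pool)    # assumed hook
        picks = ensemble_query(u, a, model_f1, per_round)
        for p in picks:
            idx = pool[p]
            labeled_X.append(windows[idx])
            labeled_y.append(oracle(idx))          # human annotation
        pool = [idx for j, idx in enumerate(pool) if j not in set(picks)]
        model.fit(np.array(labeled_X), np.array(labeled_y))
        model_f1 = model.validation_f1()           # assumed hook
    return model
```

The point of the sketch is the budget discipline: only a small, fixed fraction of windows is ever sent to the oracle, and which windows are chosen depends on both the anomaly scores and the current state of the model.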