Abstract

Log data records system state and runtime behavior, and is commonly used to diagnose system failures and detect anomalies. However, as systems grow more complex than ever, their logs change over time, and the accuracy of log-based anomaly detection algorithms drops sharply on such dynamic logs, a phenomenon known as concept drift. In this paper, we design a confidence-guided anomaly detection model that combines multiple algorithms, called Multi-CAD. We first propose a statistical value, the p_value, that measures the non-conformity between logs and links a new log to previous logs. Multiple suitable algorithms can be chosen as non-conformity measures; each one calculates a score for combined detection instead of making a decision on its own. We then design a confidence-guided parameter adjustment method to counter concept drift in dynamic logs. Through a feedback mechanism, a trusted result, which contains a label, a non-conformity score, and a confidence, updates the score set under the corresponding label and serves as prior experience for subsequent detection. Finally, we demonstrate that Multi-CAD achieves balanced performance in precision, recall, and F_measure, and detects actual anomalies on multiple datasets. Extensive experimental results show that Multi-CAD improves recall and F_measure by almost 20% on average compared with four typical algorithms on the HDFS benchmark dataset, where it achieves 98.2% precision, 95.2% recall, and 96.7% F_measure.
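
The p_value and feedback steps summarized in the abstract can be illustrated with a minimal sketch in the style of conformal prediction. It is not the authors' implementation. The names `p_value` and `detect`, the averaging of p_values across measures, the confidence defined as one minus the runner-up p_value, and the 0.8 threshold are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Label = str


def p_value(score: float, previous_scores: List[float]) -> float:
    """Share of scores (previous ones plus the new one) that are at least
    as non-conforming as `score`. A higher value means the new log fits better."""
    at_least = sum(1 for s in previous_scores if s >= score)
    return (at_least + 1) / (len(previous_scores) + 1)


def detect(
    new_scores: Dict[Label, List[float]],
    score_sets: Dict[Label, List[List[float]]],
    threshold: float = 0.8,  # assumed trust threshold, not taken from the paper
) -> Tuple[Label, float]:
    """Combine several non-conformity measures and feed trusted results back.

    new_scores[label][k]  : score of the new log under measure k for `label`
    score_sets[label][k]  : previous scores of measure k for `label`
    """
    combined: Dict[Label, float] = {}
    for label, scores in new_scores.items():
        ps = [p_value(s, score_sets[label][k]) for k, s in enumerate(scores)]
        # Assumption: measures are combined by averaging their p_values.
        combined[label] = sum(ps) / len(ps)

    ranked = sorted(combined, key=lambda lab: combined[lab], reverse=True)
    label = ranked[0]
    # Assumption: confidence is one minus the p_value of the runner-up label.
    confidence = 1.0 - combined[ranked[1]]

    # Feedback: only a trusted result extends the score set for its label.
    if confidence >= threshold:
        for k, s in enumerate(new_scores[label]):
            score_sets[label][k].append(s)
    return label, confidence


if __name__ == "__main__":
    # Hypothetical score sets for two measures and two labels.
    history = {
        "normal": [[0.1, 0.2, 0.3], [0.2, 0.2, 0.4]],
        "anomaly": [[0.8, 0.9], [0.7, 0.9]],
    }
    incoming = {"normal": [0.15, 0.1], "anomaly": [5.0, 4.0]}
    print(detect(incoming, history, threshold=0.6))
```

Because each measure contributes only a p_value, a measure can be added or removed without changing the decision rule, and the threshold controls how readily new logs reshape the score sets.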

