Abstract
System logs record the status and important events of a system at different points in time. They are an important resource for administrators to understand and manage the system, and detecting anomalies in logs is critical to identifying system faults in time. However, with the increasing size and complexity of today's software systems, the number of logs has exploded, and in many cases traditional manual log checking has become impractical and time-consuming. On the other hand, existing automatic log anomaly detection methods are error-prone and often rely on indices or log templates. In this work, we propose LogLS, a system log anomaly detection method based on dual long short-term memory (LSTM) networks with a symmetric structure, which treats the system log as a natural-language sequence and models it according to both the preorder relationship and the postorder relationship. LogLS builds on the DeepLog method and is optimized to address the poor prediction performance of LSTM on long sequences. By providing a feedback mechanism, it can also predict logs that have not appeared before. To evaluate LogLS, we conducted experiments on two real datasets, and the experimental results demonstrate the effectiveness of the proposed method in log anomaly detection.
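The idea of checking each log key against both its preceding and its following context can be illustrated with a minimal sketch. This is not the LogLS implementation: simple frequency tables stand in for the two LSTMs, and the function names, the window size `h`, the candidate count `g`, the rule that a key is flagged when either direction rejects it, and the choice to skip contexts never seen in training are all assumptions made for illustration. The feedback mechanism is not covered.

```python
from collections import Counter, defaultdict

def train(sequences, h):
    """Count which log key follows each window of h keys."""
    table = defaultdict(Counter)
    for seq in sequences:
        for i in range(h, len(seq)):
            table[tuple(seq[i - h:i])][seq[i]] += 1
    return table

def top_g(table, context, g):
    """Return the g most frequent keys for a context, or None if unseen."""
    counts = table.get(tuple(context))
    if counts is None:
        return None
    return {key for key, _ in counts.most_common(g)}

def find_anomalies(seq, fwd, bwd, h, g):
    """Return positions whose key is rejected by either direction."""
    flagged = []
    for i, key in enumerate(seq):
        checks = []
        if i >= h:
            # Preorder: the h keys before position i.
            checks.append((fwd, seq[i - h:i]))
        if i + h < len(seq):
            # Postorder: the h keys after position i, read backwards.
            checks.append((bwd, seq[i + 1:i + 1 + h][::-1]))
        for table, context in checks:
            candidates = top_g(table, context, g)
            # Assumption: an unseen context gives no evidence either way.
            if candidates is not None and key not in candidates:
                flagged.append(i)
                break
    return flagged

# Hypothetical log-key sequences from normal executions.
normal = [[1, 2, 3, 4, 5]] * 3
fwd = train(normal, h=2)
bwd = train([s[::-1] for s in normal], h=2)  # same model, reversed input

print(find_anomalies([1, 2, 9, 4, 5], fwd, bwd, h=2, g=1))
print(find_anomalies([7, 2, 3, 4, 5], fwd, bwd, h=2, g=1))
```

In the second call the unexpected key is the first entry of the sequence, where a forward-only model has no preceding window to predict from; only the backward direction can check it.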
Highlights
Many log files are often produced during the operation of modern systems
This paper proposes a system log anomaly detection method based on dual long short-term memory (LSTM)
This article uses an authoritative dataset commonly used in system log anomaly detection: the HDFS log dataset released by Wei Xu et al. [5]
Summary
Many log files are often produced during the operation of modern systems. They reflect the running state of the system and record the activity information of specific events in the system, making them a valuable resource for understanding the system's state. Rule-based anomaly detection [2] generally requires manual analysis of logs and rule creation in advance, so its degree of automation is low. Ref. [3] created rule sets based on an analysis of log time-series information, which effectively reduced the false-positive rate of the system, but with low automation and high labor cost.