Abstract
Logging is a common practice in software engineering to provide insights into working systems. The main uses of log files have always been failure identification and root cause analysis. In recent years, novel applications of logging have emerged that benefit from automated analysis of log files, for example, real-time monitoring of system health, understanding users’ behavior, and extracting domain knowledge. Although nearly every software system produces log files, the biggest challenge in log analysis is the lack of a common standard for both the content and format of log data. This paper provides a systematic review of recent literature (covering the period between 2000 and June 2021, concentrating primarily on the last five years of this period) related to automated log analysis. Our contribution is threefold: we present an overview of various research areas in the field; we identify different types of log files that are used in research; and we systematize the content of log files. We believe that this paper serves as a valuable starting point for new researchers in the field, as well as an interesting overview for those looking for other ways of utilizing log information.
Highlights
Tracking a system’s behavior during its operation has been a common need since the beginning of software engineering
Log analysis extends what is possible in the traditional applications of logging data: failure diagnosis and root cause analysis
We focus only on automated log analysis, which means that a paper needs to present a consistent, repeatable method for extracting certain information from log files for a particular purpose
Summary
Tracking a system’s behavior during its operation has been a common need since the beginning of software engineering. The main area of focus was failure diagnosis, and the most common form was the recording of actions taken by a system in log files. Studies such as [1] and [2] show that logging is a commonly used practice in the industry. With the rise of cloud computing, new challenges to logging practices have emerged: the distribution of log files among multiple services, a significant increase in log volumes, and a multitude of log formats. As log volumes keep growing and log files become more dispersed across services, manual analysis becomes very challenging.