Abstract

Logs is an important source of data in the field of security analysis. Log messages characterized by unstructured text, however, pose extreme challenges to security analysis. To this end, the first issue to be addressed is how to efficiently parse logs into structured data in real-time. The existing log parsers mostly parse raw log files by batch processing and are not applicable to real-time security analysis. It is also difficult to parse large historical log sets with such parsers. Some streaming log parsers also have some demerits in accuracy and parsing performance. To realize automatic, accurate, and efficient real-time log parsing, we propose Spray, a streaming log parser for real-time analysis. Spray can automatically identify the template of a real-time incoming log and accurately match the log and its template for parsing based on the law of contrapositive. We also improve Spray’s parsing performance based on key partitioning and search tree strategies. We conducted extensive experiments from such aspects as accuracy and performance. Experimental results show that Spray is much more accurate in parsing a variety of public log sets and has higher performance for parsing large log sets.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call