Abstract

Because of its accuracy, pattern matching has recently been applied to Internet security applications such as intrusion detection and prevention, anti-virus, and anti-malware. Among the well-known pattern matching algorithms, the Aho-Corasick (AC) algorithm can match multiple pattern strings simultaneously with a worst-case performance guarantee, and it is adopted in both the open-source Clam AntiVirus (ClamAV) scanner and the Snort intrusion detection system. The AC algorithm is based on a finite automaton, which can be implemented straightforwardly with a two-dimensional state transition table. However, the memory requirement prohibits such an implementation when the total length of the pattern strings is large. The ClamAV implementation limits the depth of the finite automaton and combines it with linked lists to reduce the memory requirement. Snort adopts the banded-row format to compress the state transition table and uses it as an alternative pattern matching machine. In this paper, we present a novel implementation that requires little memory and achieves high throughput. Compared with the banded-row format, our proposed scheme reduces the memory requirement by 39.7% for 5,000 patterns randomly selected from ClamAV signatures. In addition, the processing time of our proposed scheme is, on average, 83.9% of that of the banded-row format when scanning various types of files. Compared with the ClamAV implementation using the same 5,000 patterns and files, our proposed scheme requires slightly more memory but reduces processing time by 80.6% on average.
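
To make the two baseline representations concrete, the following is a minimal textbook-style sketch of an AC automaton stored as a full two-dimensional state transition table, together with a simple banded-row compression of each row. It is not the scheme proposed in the paper, nor the actual ClamAV or Snort code. The function names (`build_ac_table`, `search`, `to_banded_row`, `banded_next`) and the `(offset, entries)` band layout are assumptions made for illustration only.

```python
from collections import deque

ALPHABET = 256  # one table column per possible byte value


def build_ac_table(patterns):
    """Return (table, outputs): table[state][byte] is the next state."""
    table = [[-1] * ALPHABET]  # trie edges first; -1 marks "no edge yet"
    outputs = [set()]          # patterns recognized on entering each state
    for pat in patterns:
        state = 0
        for b in pat:
            if table[state][b] == -1:
                table.append([-1] * ALPHABET)
                outputs.append(set())
                table[state][b] = len(table) - 1
            state = table[state][b]
        outputs[state].add(pat)

    # Breadth-first pass: fill every missing edge via failure links so that
    # each (state, byte) pair has exactly one successor (a full DFA).
    fail = [0] * len(table)
    queue = deque()
    for b in range(ALPHABET):
        if table[0][b] == -1:
            table[0][b] = 0
        else:
            queue.append(table[0][b])
    while queue:
        state = queue.popleft()
        outputs[state] |= outputs[fail[state]]
        for b in range(ALPHABET):
            nxt = table[state][b]
            if nxt == -1:
                table[state][b] = table[fail[state]][b]
            else:
                fail[nxt] = table[fail[state]][b]
                queue.append(nxt)
    return table, outputs


def search(table, outputs, data):
    """Scan data once; return (start_offset, pattern) for every match."""
    state, hits = 0, []
    for i, b in enumerate(data):
        state = table[state][b]
        for pat in outputs[state]:
            hits.append((i - len(pat) + 1, pat))
    return hits


def to_banded_row(row):
    """Assumed band layout: keep only the span between the first and last
    entries that differ from the default next state 0."""
    nonzero = [i for i, s in enumerate(row) if s != 0]
    if not nonzero:
        return (0, [])
    lo, hi = nonzero[0], nonzero[-1]
    return (lo, row[lo:hi + 1])


def banded_next(band, b):
    """Look up a transition in a banded row; outside the band means state 0."""
    lo, entries = band
    idx = b - lo
    return entries[idx] if 0 <= idx < len(entries) else 0


if __name__ == "__main__":
    table, outputs = build_ac_table([b"he", b"she", b"his", b"hers"])
    print(sorted(search(table, outputs, b"ushers")))
    bands = [to_banded_row(row) for row in table]
    full_cells = len(table) * ALPHABET
    band_cells = sum(len(entries) for _, entries in bands)
    print(full_cells, band_cells)
```

The full table costs one cell per state per byte value, which is why it becomes impractical for large pattern sets. The banded rows store only the span of non-default transitions, at the price of a bounds check on every lookup.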
