Abstract

Regular expression matching is a core component of deep packet inspection (DPI) and is widely used in modern network intrusion detection systems (NIDS), traffic classification systems, and network monitoring systems. In these systems, regular expressions are typically converted to a deterministic finite automaton (DFA), which scans each byte of an incoming packet’s payload against the regular expression rule sets to decide whether the packet matches any rule. A match indicates that the packet contains a specific attack, virus, or similar threat. However, a DFA generally consumes a large amount of memory, and much recent work focuses on reducing this memory requirement. Following this line of work, in this paper we propose a novel, compact, and time-efficient DFA structure, the Reduced Input Character Set DFA (RICS-DFA), that significantly decreases the DFA’s space. We first introduce a character escaping and replacing scheme to decrease the size of the DFA’s character set, and then apply a series of optimization techniques based on RICS-DFA to further reduce its space requirement. A RICS-DFA is constructed by transition rewriting. Experimental results on real-life rule sets show that, compared to the original DFA, the RICS-DFA reduces memory consumption by 68 %–92 % at the cost of only a trivial loss in matching speed.
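
To make the memory argument concrete, the following minimal Python sketch shows why the size of the input character set drives the size of a table-driven DFA. It is an illustration under stated assumptions, not the RICS-DFA construction: it uses generic alphabet compression (merging bytes whose transition columns are identical) rather than the character escaping and replacing scheme proposed in the paper, and it builds the DFA for a single literal pattern instead of a real rule set. All function names are hypothetical.

```python
from typing import Dict, List, Tuple


def build_literal_dfa(pattern: bytes) -> List[List[int]]:
    """Build a full 256-column DFA table that accepts any payload containing
    `pattern` (KMP-style construction; the last state is an absorbing accept)."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    m = len(pattern)
    table = [[0] * 256 for _ in range(m + 1)]
    table[0][pattern[0]] = 1
    restart = 0  # state reached by the longest proper border seen so far
    for j in range(1, m):
        table[j] = list(table[restart])        # mismatch: behave like restart state
        table[j][pattern[j]] = j + 1           # match: advance
        restart = table[restart][pattern[j]]
    table[m] = [m] * 256                       # once matched, stay matched
    return table


def compress_alphabet(table: List[List[int]]) -> Tuple[List[int], List[List[int]]]:
    """Assumed, generic reduction: bytes with identical transition columns
    share one class, so each state stores one entry per class."""
    class_of = [0] * 256
    seen: Dict[Tuple[int, ...], int] = {}
    for byte in range(256):
        column = tuple(row[byte] for row in table)
        class_of[byte] = seen.setdefault(column, len(seen))
    reduced = [[0] * len(seen) for _ in table]
    for byte in range(256):
        for state, row in enumerate(table):
            reduced[state][class_of[byte]] = row[byte]
    return class_of, reduced


def matches_full(table: List[List[int]], payload: bytes) -> bool:
    """Scan the payload one byte at a time over the full table."""
    state = 0
    for byte in payload:
        state = table[state][byte]
    return state == len(table) - 1


def matches_reduced(class_of: List[int], reduced: List[List[int]], payload: bytes) -> bool:
    """Same scan, with one extra lookup per byte to map it to its class."""
    state = 0
    for byte in payload:
        state = reduced[state][class_of[byte]]
    return state == len(reduced) - 1
```

In this sketch the reduced table stores one entry per state per character class instead of one per state per byte value, at the price of a 256-entry mapping array and one additional lookup for every scanned byte. That is the same kind of space-for-speed trade-off the abstract describes, although the paper's actual scheme and its measured savings come from its own construction, not from this example.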
