Abstract

Regular expressions have become a fixture in network security systems such as Network Intrusion Detection, spam email filtering, and antivirus. Unfortunately, regular expressions require considerably more resources to match than fixed binary or character strings. Much research has focused on improving matching architectures or hardware support to make regular expression matching more efficient. This research, however, investigated whether the regular expression set itself offers any leverage for creating more efficient automata before those automata are mapped to any specific matching architecture or hardware. We found that typical Non-deterministic Finite Automata (NFA) construction methodologies create redundant paths in the NFA when applied to the complex rule-sets employed in network security; this stems directly from the fact that constructing optimized NFA is a hard problem. We therefore created REduce, a tool that uses shared prefixes among regular expressions as a heuristic to eliminate the corresponding redundant paths in the constructed NFA. The end result is smaller matching automata (4-50% smaller, depending on the rule-set) and a 4-900% improvement in throughput due to the reduction in active states. More importantly, REduce targets only NFA construction, so the generated NFA can still be converted to any specific matching architecture or hardware for cumulative improvement.
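
To make the shared-prefix heuristic concrete, the sketch below is a minimal illustration (not the actual REduce algorithm) of merging rules that share a literal prefix into a trie, so that they share one path rather than redundant parallel paths. The function names, the simplistic prefix extraction, and the example rules are assumptions for illustration only.

    # Minimal sketch of the shared-prefix idea; names are illustrative,
    # not the REduce API.

    def literal_prefix(pattern: str) -> str:
        """Return the leading literal characters of a regex pattern,
        stopping at the first metacharacter (a simplification)."""
        meta = set(".^$*+?()[]{}|\\")
        out = []
        for ch in pattern:
            if ch in meta:
                break
            out.append(ch)
        return "".join(out)

    def merge_shared_prefixes(patterns):
        """Build a trie over literal prefixes so patterns that share a
        prefix share a single path instead of redundant parallel paths."""
        trie = {}
        for pat in patterns:
            pre = literal_prefix(pat)
            node = trie
            for ch in pre:
                node = node.setdefault(ch, {})
            # Keep the remaining (non-literal) suffix for later NFA expansion.
            node.setdefault("$suffixes", []).append(pat[len(pre):])
        return trie

    if __name__ == "__main__":
        rules = ["GET /admin.*", "GET /login\\?user=.*", "POST /upload.*"]
        # The two "GET /" rules now share one prefix path rather than two.
        print(merge_shared_prefixes(rules))

In an actual NFA, the analogous step merges the states along shared prefix paths, which is where the reduction in automaton size and active state reported above comes from.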
