Abstract

In this paper, we propose FlowSifter, a framework for automated online application protocol field extraction. FlowSifter is based on a new grammar model called Counting Regular Grammars (CRG) and a corresponding automata model called Counting Automata (CA). The CRG and CA models add counters with update functions and transition guards to regular grammars and finite state automata. These additions give CRGs and CAs the ability to parse and extract fields from context sensitive application protocols. These additions also facilitate fast and stackless approximate parsing of recursive structures. These new grammar models enable FlowSifter to generate optimized Layer 7 field extractors from simple extraction specifications. We compare FlowSifter against both BinPAC and UltraPAC, which represent the state-of-the-art field extractors. Our experiments show that when compared to BinPAC parsers, FlowSifter runs more than 21 times faster and uses 49 times less memory. When compared to UltraPAC parsers, FlowSifter extractors run 12 times faster and use 24 times less memory.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call