Abstract

With many network protocols using obfuscation techniques to hide their identity, robust methods of traffic classification are required. In traditional deep-packet-inspection (DPI) methods, application-specific signatures are generated from byte-level payload data. Increasingly, new data formats are being used to encode application protocols with bit-level information, which renders byte-level signatures ineffective. In this paper, we describe BitCoding, a bit-level DPI-based signature generation technique. BitCoding uses only a small number of initial bits from a flow and identifies the invariant bits as the signature. Subsequently, these bit signatures are encoded and transformed into a newly defined state transition machine, the transition constrained counting automaton. While short signatures are efficient to process, they increase the chance of collisions and cross-signature matching as the number of signatures (applications) grows. We describe a method for detecting signature similarity using a variant of Hamming distance and propose increasing the signature length for a subset of protocols to avoid overlaps. We perform extensive experiments with three different data sets consisting of 537,380 flows with a packet count of 3,445,969 and show that BitCoding has very good detection performance across different types of protocols (text, binary, and proprietary), making it protocol-type agnostic. Further, to understand the portability of the generated signatures, we perform cross-evaluation, i.e., signatures generated from one site are tested against data from other sites, and conclude that this leads to only a small compromise in detection performance.

