Abstract
With the rapid development of the Internet, and of the mobile Internet in particular, new applications and network attacks have emerged at a high rate in recent years. A growing share of traffic is unknown because no protocol specifications are available for these newly emerging applications. Automatic protocol reverse engineering is a promising solution for understanding this unknown traffic and recovering its protocol specification. One challenge of protocol reverse engineering is to determine the length of protocol keywords and message fields. Existing algorithms select the longest substrings as protocol keywords, which is an empirical way to decide keyword length. In this paper, we propose a novel approach that determines the optimal length of protocol keywords and recovers the message formats of Internet protocols by maximizing the likelihood of message segmentation and keyword selection. A hidden semi-Markov model is presented to model the protocol message format. A clustering technique based on affinity propagation is introduced to determine the message type. The proposed method is applied to identify network traffic, and the results are compared with those of an existing algorithm.
Highlights
Network protocol specifications, which describe the structure of protocol messages and regulate the behavior of communicating entities on the Internet, play an important role in addressing a number of security- and management-oriented issues in several domains of computing and networking.
Since published specifications of binary protocols contain no information about protocol keywords, we evaluate protocol keyword extraction only for text-based protocols (i.e., HTTP and SSDP).
The protocol keywords and message fields are inferred with a hidden semi-Markov model by maximizing the likelihood of the message segmentation.
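The idea of choosing segment boundaries by maximizing the likelihood of a segmentation can be illustrated with a small dynamic program. The sketch below is not the paper's hidden semi-Markov model: it assumes a single hidden state, a bounded segment length, and a hand-written toy scoring function. The names `segment`, `toy_logprob`, `KEYWORDS`, and all probability values are hypothetical and chosen only for illustration.

```python
import math


def segment(message, seg_logprob, max_len):
    """Return (best log-likelihood, segments) of the most likely segmentation.

    best[i] holds the best log-likelihood of message[:i]; each candidate
    last segment message[start:end] is at most max_len characters long.
    """
    n = len(message)
    best = [-math.inf] * (n + 1)
    back = [0] * (n + 1)  # back[i]: start index of the last segment of message[:i]
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            score = best[start] + seg_logprob(message[start:end])
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Follow the back-pointers from the end to recover the segments.
    segments = []
    end = n
    while end > 0:
        start = back[end]
        segments.append(message[start:end])
        end = start
    segments.reverse()
    return best[n], segments


# Hypothetical keyword probabilities (assumed, not learned from traffic).
KEYWORDS = {"GET": 0.3, "HTTP/1.1": 0.3, " ": 0.2}


def toy_logprob(seg):
    """Toy segment score: known keywords are cheap, unknown text is costly."""
    if seg in KEYWORDS:
        return math.log(KEYWORDS[seg])
    # Unknown segment: a fixed per-segment cost plus a per-character cost.
    return math.log(0.05) + len(seg) * math.log(0.1)


message = "GET /a HTTP/1.1"
score, segs = segment(message, toy_logprob, max_len=8)
print(segs)  # ['GET', ' ', '/a', ' ', 'HTTP/1.1']
```

In this toy setting the keyword length is not fixed in advance: "HTTP/1.1" is kept whole because that segmentation has the highest total log-likelihood, not because it is the longest substring. A full hidden semi-Markov model would add hidden states with their own duration and emission distributions and would estimate the probabilities from traffic rather than from a hand-written table.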
Summary
Network protocol specifications, which describe the structure of protocol messages and regulate the behavior of communicating entities on the Internet, play an important role in addressing a number of security- and management-oriented issues in several domains of computing and networking. Intrusion detection systems and firewalls require protocol specifications to perform deep packet inspection. Security experts analyze the specifications of command-and-control (C&C) protocols [1] to detect and defend against botnets. Network administrators build application signatures from protocol specifications to identify protocols and tunnels in monitored network traffic. Protocol specifications are also powerful tools for enabling interoperation between multiple systems based on incompatible protocols [3,4,5].