Abstract

With the rapid development of the Internet, especially the mobile Internet, new applications and network attacks have emerged at a high rate in recent years. More and more traffic is unknown because protocol specifications for these newly emerging applications are lacking. Automatic protocol reverse engineering is a promising solution for understanding this unknown traffic and recovering its protocol specification. One challenge of protocol reverse engineering is determining the length of protocol keywords and message fields. Existing algorithms select the longest substrings as protocol keywords, which is an empirical way of deciding keyword length. In this paper, we propose a novel approach that determines the optimal length of protocol keywords and recovers the message formats of Internet protocols by maximizing the likelihood of message segmentation and keyword selection. A hidden semi-Markov model is presented to model the protocol message format. A clustering technique based on affinity propagation is introduced to determine the message type. The proposed method is applied to identify network traffic, and its results are compared with those of an existing algorithm.

Highlights

  • Network protocol specifications, which describe the structure of protocol messages and regulate the behavior of communicating entities on the Internet, play an important role in addressing a number of security- and management-oriented issues in several domains of computing and networking

  • Since there is no information about protocol keywords of binary protocols in published protocol specifications, we only evaluate protocol keyword extraction for text-based protocols (i.e., HTTP and SSDP)

  • The protocol keywords and message fields are inferred using a hidden semi-Markov model by maximizing the likelihood of message segmentation
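
The idea of choosing segment lengths by maximizing segmentation likelihood, rather than always taking the longest substring, can be illustrated with a small dynamic program. This is only a minimal sketch under stated assumptions, not the paper's hidden semi-Markov model: the keyword table `TOY_KEYWORDS`, the scoring function `toy_logprob`, and the penalty constants are all hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical keyword probabilities (illustrative values, not learned from traffic).
TOY_KEYWORDS = {"GET": 0.2, "HTTP/1.1": 0.2, "Host:": 0.2}


def toy_logprob(segment):
    """Log-probability of one segment under a toy model (an assumption)."""
    if segment in TOY_KEYWORDS:
        return math.log(TOY_KEYWORDS[segment])
    # Unknown segment: a fixed cost per segment plus a cost per character.
    return math.log(0.5) + len(segment) * math.log(0.01)


def segment_message(message, seg_logprob, max_len):
    """Find the segmentation with the highest total log-likelihood.

    best[i] holds the best score for message[:i]; back[i] holds the start
    index of the last segment in that best segmentation.
    """
    n = len(message)
    best = [-math.inf] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for end in range(1, n + 1):
        for length in range(1, min(max_len, end) + 1):
            start = end - length
            score = best[start] + seg_logprob(message[start:end])
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Walk the back-pointers from the end to recover the segments.
    segments = []
    end = n
    while end > 0:
        start = back[end]
        segments.append(message[start:end])
        end = start
    segments.reverse()
    return best[n], segments


score, segments = segment_message("GET /a HTTP/1.1", toy_logprob, max_len=8)
print(segments)
```

With these toy values the message is split into `GET`, ` /a `, and `HTTP/1.1`. Segment boundaries, and therefore keyword lengths, fall out of the likelihood maximization instead of a fixed longest-substring rule. A real semi-Markov model would additionally learn state transition and duration distributions from the traffic.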


Summary

Introduction

Network protocol specifications, which describe the structure of protocol messages and regulate the behavior of communicating entities on the Internet, play an important role in addressing a number of security- and management-oriented issues in several domains of computing and networking. Intrusion detection systems and firewalls require protocol specifications to perform deep packet inspection. Security experts monitor and analyze the specifications of command and control (C&C) protocols [1] to detect and defend against botnets. Network administrators build application signatures from protocol specifications to identify protocols and tunnels in monitored network traffic. Protocol specifications are also powerful tools for enabling interoperation between multiple systems based on incompatible protocols [3,4,5].

