Abstract

Automatic reverse engineering of application protocols is becoming increasingly important for applications such as protocol analyzers, penetration testing, and intrusion prevention and detection. However, many techniques for extracting the message format specifications of unknown applications are limited by a lack of prior information or by long running times. In this paper, we present a method for automatically reverse engineering the protocol message formats of an application from its network trace, using latent Dirichlet allocation (LDA) and association analysis. The approach infers the semantics of protocol messages without access to the executable code of the application; it rests on the insight that the n-grams of protocol traces carry rich semantic information that can be leveraged for accurate message format inference. First, we propose a way to extract keywords using the LDA model. Second, association analysis is applied to the extracted keywords to construct feature words. Finally, our experiments show that the method can accurately infer the message format specifications of the text-based SMTP protocol.
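To make the pipeline more concrete, the following is a minimal sketch of the n-gram and association-analysis stages on a toy SMTP-like trace. It is not the authors' implementation: the LDA keyword step is replaced here by a plain document-frequency threshold so that the example needs only the Python standard library. The function names (`byte_ngrams`, `frequent_ngrams`, `pair_confidence`), the thresholds, and the sample messages are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def byte_ngrams(message: bytes, n: int) -> set:
    """Return the set of contiguous n-byte substrings of one message."""
    return {message[i:i + n] for i in range(len(message) - n + 1)}


def frequent_ngrams(messages, n, min_support):
    """Keep n-grams present in at least `min_support` (a fraction) of messages.

    Assumption: this frequency filter stands in for the LDA keyword step.
    """
    doc_freq = Counter()
    for msg in messages:
        doc_freq.update(byte_ngrams(msg, n))
    return {g for g, c in doc_freq.items() if c / len(messages) >= min_support}


def pair_confidence(messages, keywords, n, min_confidence):
    """Return {(a, b): confidence} for association rules a -> b."""
    present = [byte_ngrams(msg, n) & keywords for msg in messages]
    rules = {}
    for a, b in permutations(keywords, 2):
        with_a = [p for p in present if a in p]
        if not with_a:
            continue  # keyword never observed, so no rule can be formed
        conf = sum(b in p for p in with_a) / len(with_a)
        if conf >= min_confidence:
            rules[(a, b)] = conf
    return rules


# Hypothetical trace: four SMTP-style command lines.
trace = [
    b"MAIL FROM:<a@x.org>",
    b"MAIL FROM:<b@y.org>",
    b"RCPT TO:<c@z.org>",
    b"RCPT TO:<d@x.org>",
]
keywords = frequent_ngrams(trace, n=4, min_support=0.5)
rules = pair_confidence(trace, keywords, n=4, min_confidence=1.0)
```

In this sketch, 4-grams such as `MAIL` and `FROM` survive the support filter, and the rule `MAIL -> FROM` holds with full confidence, which hints at how co-occurring keywords could be merged into a longer feature word such as `MAIL FROM`. A real system would presumably use a topic model for the keyword step, as the abstract describes.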
