Network Communication Protocol Reverse Engineering Based on Auto-Encoder

Tianxiang Yu, Bingqing Hou, Yang Xin, Yuexin Tao, Hongliang Zhu

doi:10.1155/2022/2924479

Abstract

Network communication protocol reverse engineering is useful for network security applications, including protocol fuzz testing, botnet command infiltration, and service script generation. Many models have been proposed to recover field boundaries, field semantics, state machines, and other format information from network traces and program execution for text-based and hybrid protocols. However, extracting format information from network trace data for binary-based protocols remains a challenging problem. Existing network-trace-based models focus on text-based and hybrid protocols and rely on tokenization and other heuristic rules, such as field identification, to perform reverse engineering, which makes them hard to apply to binary-based protocols. In this paper, we propose a complete mechanism for binary-based protocol reverse engineering that uses auto-encoder models and clustering algorithms and requires only network trace data. For evaluation, we define a set of metrics and compare our model with existing models, showing that it fills a need in the field of protocol reverse engineering.

Full Text