Abstract

Protocol message format extraction is a principal process of automatic network protocol reverse engineering when target protocol specifications are not available. However, binary protocol reverse engineering has been a new challenge in recent years for approaches that traditionally have dealt with text-based protocols rather than binary protocols. In this study, the authors propose a novel approach called PRE-Bin that automatically extracts binary-type fields of binary protocols based on fine-grained bits. First, a silhouette coefficient is introduced into the hierarchical clustering to confirm the optimal clustering number of binary frames. Second, a modified multiple sequence alignment algorithm, in which the matching process and back-tracing rules are redesigned, is also proposed to analyse binary field features. Finally, a Bayes decision model is invoked to describe field features and determine bit-oriented field boundaries. The maximum a posteriori criterion is leveraged to complete an optimal protocol format estimation of binary field boundaries. The authors implemented a prototype system of PRE-Bin to infer the specification of binary protocols from actual traffic traces. Experimental results indicate that PRE-Bin effectively extracts binary fields and outperforms the existing algorithms.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call