Abstract
Abstract Protocol reverse engineering is crucial in normative verification, and malware behavior analysis and vulnerability discovery. However, uncovering the structural features of binary protocols concealed within dense data representations remains a significant challenge. Accurately identifying keyword segments associated with message types is a prerequisite for meaningful semantic analysis and protocol state machine reduction. In this work, we introduce a novel approach for inferring keywords from binary protocols based on probabilistic statistics. Our method in terms of Byte employs heuristic rules to filter offset positions that are clearly unrelated to message types. We further filter candidate Byte-offsets utilizing constraint relations and provide the probabilistic ranking of each offset as the keyword segment. To enhance the reliability of keyword segment inference, we utilize the Monte Carlo algorithm to assess the difference between message clustering with candidate Byte-offset and random message clustering, and reorder candidate offsets according to the results. Then we can observe optimal values from both orderings and present the ultimate inference results. Experimental results demonstrate that our method excels in the accuracy of keyword segments identification compared with previous techniques.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.