Abstract

With the rapid development of information technology, the scale of software has grown dramatically. Binary code similarity detection plays an important role in many fields, such as software plagiarism detection, vulnerability discovery, and the resolution of copyright disputes. Although a variety of approaches to binary code semantic representation have been introduced recently, few of them keep pace with existing code obfuscation techniques, which are mature and widely developed. To address this problem, we propose a new neural network model, named SROBR, which deeply integrates a natural language processing model with a graph neural network. In SROBR, BERT is first applied to capture the sequence information of the binary code, and GAT is then used to capture its structural information. This combination allows SROBR to capture the semantic information of binary programs while resisting obfuscation options more efficiently. Experimental results on a binary code similarity detection task and an obfuscation option classification task demonstrate that SROBR outperforms existing binary similarity detection methods in resisting obfuscation techniques.
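
The two-stage idea of sequence embeddings followed by graph attention can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that one embedding vector per basic block is already available (standing in for BERT output), treats control-flow edges as undirected, uses a single attention head with no learned projection matrix, and mean-pools the node vectors into one function embedding. The names `gat_layer`, `embed_function`, `cosine`, and the attention vector `attn_vec` are hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def gat_layer(node_feats, edges, attn_vec):
    """One simplified single-head graph-attention step.

    Assumption: no learned projection W (identity), edges treated as
    undirected, and every node also attends to itself.
    """
    n = len(node_feats)
    neighbors = {i: {i} for i in range(n)}  # self-loops
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    out = []
    for i in range(n):
        nbrs = sorted(neighbors[i])
        # Unnormalized score: LeakyReLU(a . [h_i || h_j])
        scores = [
            leaky_relu(sum(a * x for a, x in zip(attn_vec, node_feats[i] + node_feats[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighborhood
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        dim = len(node_feats[i])
        out.append(
            [sum(w * node_feats[j][d] for w, j in zip(weights, nbrs)) for d in range(dim)]
        )
    return out


def mean_pool(vectors):
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def embed_function(block_embeddings, cfg_edges, attn_vec):
    """Graph-level embedding: attention over the CFG, then mean pooling."""
    return mean_pool(gat_layer(block_embeddings, cfg_edges, attn_vec))


# Toy usage: placeholder block embeddings and a three-block control-flow graph.
blocks = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1), (1, 2)]
attn_vec = [0.1, -0.2, 0.3, 0.05]  # would be learned in a real model
emb = embed_function(blocks, edges, attn_vec)
print(cosine(emb, emb))
```

In a similarity detection setting, two binary functions would each be embedded this way and compared by cosine similarity. A trained model would additionally learn the projection and attention parameters.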
