Cross-Platform Binary Code Homology Analysis Based on GRU Graph Embedding

Shen Wang, Xiaohui Su, Xiangzhan Yu, Xunzhi Jiang

doi:10.1155/2021/3095203

Abstract

Binary code homology analysis refers to detecting whether two pieces of binary code are compiled from the same piece of source code, which is a fundamental technique for many security applications, such as vulnerability search, plagiarism detection, and malware detection. With the increase in critical vulnerabilities in IoT devices, homology analysis is increasingly needed to perform cross-platform vulnerability searches. Existing methods for cross-platform binary code homology detection usually convert binary code to instruction sequences and do semantic embedding of the sequences as if they were natural language. However, the gap between natural language and binary code is large, and the spatial features of the binary code are easily lost by directly comparing the semantics. In this paper, we propose a GRU-based graph embedding method to compare the homology of binary functions. First, the attribute control flow graph (ACFG) is built for the assembly function, then the GRU-based graph embedding neural network is used to generate the embedding vector for the ACFG, and finally the homology of the binary code is determined by calculating the distance between the embedding vectors. The experimental results show that our method greatly improves the detection accuracy of negative samples compared with Gemini, the latest method based on graph embedding binary code similarity detection.
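The three steps named in the abstract (build an ACFG, embed it with a GRU-based graph network, compare embeddings by distance) can be illustrated with a minimal, untrained sketch. Everything in it is an assumption made for illustration and is not the authors' architecture: the ACFG is a plain dictionary of per-block attribute vectors plus a neighbor map, the GRU weights are random rather than learned, neighbor states are summed to form the message, and cosine similarity stands in for the distance between embedding vectors. The names `GRUCell`, `embed_acfg`, and `cosine_similarity` are hypothetical.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def vec_sum(vectors, dim):
    """Element-wise sum of vectors; a zero vector if there are none."""
    total = [0.0] * dim
    for vec in vectors:
        total = [a + b for a, b in zip(total, vec)]
    return total


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class GRUCell:
    """Standard GRU cell with random (untrained) weights, for illustration only."""

    def __init__(self, in_dim, hid_dim, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.Wz, self.Uz = mat(hid_dim, in_dim), mat(hid_dim, hid_dim)
        self.Wr, self.Ur = mat(hid_dim, in_dim), mat(hid_dim, hid_dim)
        self.Wh, self.Uh = mat(hid_dim, in_dim), mat(hid_dim, hid_dim)

    def step(self, x, h):
        # Update gate z, reset gate r, candidate state, then the blended new state.
        z = [sigmoid(a + b) for a, b in zip(matvec(self.Wz, x), matvec(self.Uz, h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.Wr, x), matvec(self.Ur, h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in zip(matvec(self.Wh, x), matvec(self.Uh, rh))]
        return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def embed_acfg(attrs, neighbors, cell, hid_dim, rounds=3):
    """GRU-update every basic block from its neighbors, then sum all block states."""
    h = {v: [0.0] * hid_dim for v in attrs}
    for _ in range(rounds):
        new_h = {}
        for v in attrs:
            message = vec_sum([h[u] for u in neighbors.get(v, [])], hid_dim)
            # GRU input = block attributes concatenated with the neighbor message.
            new_h[v] = cell.step(list(attrs[v]) + message, h[v])
        h = new_h
    return vec_sum(h.values(), hid_dim)


def cosine_similarity(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


# Toy data (assumed): g1 and g2 have the same shape and attributes under different
# block labels; g3 differs in both structure and attributes.
ATTR_DIM, HID_DIM = 2, 4
cell = GRUCell(ATTR_DIM + HID_DIM, HID_DIM)
g1 = ({"a": [3.0, 1.0], "b": [1.0, 0.0], "c": [2.0, 1.0]},
      {"a": ["b", "c"], "b": ["a"], "c": ["a"]})
g2 = ({"x": [3.0, 1.0], "y": [1.0, 0.0], "z": [2.0, 1.0]},
      {"x": ["y", "z"], "y": ["x"], "z": ["x"]})
g3 = ({"p": [0.0, 5.0], "q": [4.0, 0.0]}, {"p": ["q"], "q": ["p"]})
e1, e2, e3 = (embed_acfg(a, n, cell, HID_DIM) for a, n in (g1, g2, g3))
print(cosine_similarity(e1, e2), cosine_similarity(e1, e3))
```

Because the block states are summed, the embedding does not depend on how blocks are labeled, which is why `g1` and `g2` map to the same vector. In a trained system the weights would be learned from pairs of homologous and non-homologous functions, and a threshold on the similarity would decide homology.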

Highlights

With the rise and development of Internet of Things technology, more and more embedded devices carry out network communications, and some of the security issues that exist among them have become increasingly prominent.
Evaluation Indicators. The main objective of this paper is to detect whether two binary codes from different platforms are homologous, so deep learning is used to solve this binary classification problem. Common evaluation metrics for binary classification with deep learning include accuracy, true negative rate, recall, and AUC.
(i) True Positive (TP): a positive case is predicted as the positive class
(ii) False Positive (FP): a negative case is predicted as the positive class
(iii) True Negative (TN): a negative case is predicted as the negative class
(iv) False Negative (FN): a positive case is predicted as the negative class
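The four outcome counts above are enough to compute the metrics named in the highlights. The following is a minimal sketch under stated assumptions: label 1 marks a homologous pair and 0 a non-homologous pair, the sample labels and scores are made up for illustration, predictions come from an assumed 0.5 threshold, and AUC is computed as the fraction of (positive, negative) pairs that the scores rank correctly. The function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN), assuming 1 = homologous and 0 = non-homologous."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def ratio(num, den):
    """Safe division that returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def accuracy(tp, fp, tn, fn):
    return ratio(tp + tn, tp + fp + tn + fn)


def recall(tp, fn):
    return ratio(tp, tp + fn)


def true_negative_rate(tn, fp):
    return ratio(tn, tn + fp)


def auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return ratio(wins, len(pos) * len(neg))


# Made-up labels and similarity scores for eight function pairs.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.8, 0.1, 0.3, 0.6, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, fp, tn, fn), recall(tp, fn), true_negative_rate(tn, fp), auc(y_true, scores))
```

On this toy data the counts are TP = 3, FP = 1, TN = 3, FN = 1, so accuracy, recall, and true negative rate are each 0.75, and the AUC is 0.9375. The true negative rate is the metric that reflects how well negative (non-homologous) samples are detected.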

Summary

Introduction

With the rise and development of Internet of Things technology, more and more embedded devices carry out network communications, and some of the security issues that exist among them have become increasingly prominent. Therefore, it is urgent to find a reasonable vulnerability analysis method for embedded device firmware to effectively detect the homology of similar code. Current binary code homology analysis mainly uses dynamic tracing or static analysis to obtain feature information, such as instruction sequences [1], API call sequences [2], or graph structure features [3]. Sequence information is easier to obtain than graph structure information, so most researchers work from instruction sequences or API sequences, treat the sequence as natural language, and use semantic embedding methods to obtain semantic features. Compared with graph structure information, semantic features usually have larger dimensions, which lowers detection efficiency, and they lose the spatial features of binary code execution, such as function call relationships and basic block call relationships, which remain similar across platforms. Xu et al. [5] proposed a neural network-based approach, Gemini, which shows great

Journal: Security and Communication Networks
Publication Date: Dec 18, 2021
License type: CC BY 4.0

Similar Papers

Asteria: Deep Learning-based AST-Encoding for Cross-platform Binary Code Similarity Detection
Shouguo Yang ... Yicheng Zeng
01 Jun 2021

αDiff: cross-version binary code similarity detection with DNN
Bingchang Liu ... Feng Li
03 Sep 2018

An Inclusive Report on Robust Malware Detection and Analysis for Cross-Version Binary Code Optimizations
S Poornima, R Mahalakshmi
International Journal on Recent and Innovation Trends in Computing and Communication | VOL. 11
30 Oct 2023

Neural Network-based Graph Embedding for Cross-Platform Binary Code Similarity Detection
Xiaojun Xu ... Chang Liu
30 Oct 2017
