BiN: A Two-Level Learning-Based Bug Search for Cross-Architecture Binary

Hao Wu, Fei Kang, Xiaobing Xiong, Hui Shu

doi:10.1109/access.2019.2953173

Abstract

With the popularity of IoT (Internet of Things) devices, the security risks of these devices are increasing. However, because IoT devices are multisource and heterogeneous, vulnerability detection for the Internet of Things differs significantly from PC-based vulnerability search. Therefore, determining how to accurately search for vulnerabilities in large-scale cross-platform binary executable files is an urgent problem. At present, most solutions to this problem calculate code similarity by generating a CFG (control flow graph) from binary code, but depending on the choice of architecture, OS (operating system) or compilation options, the same source code will be compiled into different assembly code. This has challenged the performance of existing vulnerability search methods on cross-architecture binaries. To alleviate the vast differences in assembly code caused by different compilation scenarios, this paper proposes a cross-platform large-scale binary vulnerability search method based on two-level feature semantic learning. The first contribution is a new structured function signature method that mitigates the large syntactic and structural differences between binary files caused by different compilation environments. Moreover, we integrate Structure2Vec and GAT (graph attention network) into a hierarchical model and train it on both the internal control flow characteristics of each function and the call relationships between functions to obtain a more accurate representation of function semantics.

Highlights

Using open source code or using third-party libraries is a common approach in the development process, and the same vendor often reuses code, which provides fertile ground for the generation and survival of vulnerabilities
We propose a vulnerability search method based on hierarchical semantic learning [44] and implement a prototype for verification experiments
Most existing graph-based function similarity calculation methods extract features directly from the function CFG, PDG (program dependency graph) [8], AST (abstract syntax tree) [9], etc., but depending on the choice of architecture, OS or compilation options, the same source code may be compiled into assembly code with different structures, so the function features extracted by these methods cannot accurately express the function semantics [35]
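The last highlight can be illustrated with a small sketch. It is not taken from the paper: the two CFGs below are hypothetical, and the three features (block count, edge count, branching blocks) are assumptions chosen only to show how raw structural features can diverge for the same source function.

```python
def cfg_features(edges):
    """Return simple structural features of a CFG given as (src, dst) basic-block edges."""
    nodes = {block for edge in edges for block in edge}
    out_degree = {block: 0 for block in nodes}
    for src, _dst in edges:
        out_degree[src] += 1
    return {
        "blocks": len(nodes),
        "edges": len(edges),
        # A block with more than one successor ends in a conditional branch.
        "branches": sum(1 for degree in out_degree.values() if degree > 1),
    }


# Hypothetical CFG of an if/else function compiled with an explicit branch.
cfg_branching = [("entry", "then"), ("entry", "else"), ("then", "exit"), ("else", "exit")]

# Hypothetical CFG of the same function where the compiler is assumed to have
# replaced the branch with a conditional move, leaving straight-line code.
cfg_flattened = [("entry", "exit")]

print(cfg_features(cfg_branching))
print(cfg_features(cfg_flattened))
```

The two feature dictionaries differ in every field even though, by assumption, both graphs come from the same source code, which is the mismatch the highlight describes.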

Summary

INTRODUCTION

Using open source code or third-party libraries is a common approach in the development process, and the same vendor often reuses code, which provides fertile ground for the generation and survival of vulnerabilities.

CONTRIBUTIONS

In summary, our main contributions are the following:
1) Guided by manual vulnerability search, we propose a solution to reduce the impact of different compilation environments on function binaries;
2) We attach the function data flow to the CFG and design a model-oriented GA to select suitable features and obtain more complete semantics;
3) We apply an artificial neural network, GAT, to construct a network architecture based on the attention mechanism over neighbor nodes, and we consider the call relationships between functions to generate richer semantic representations;
4) We establish a hierarchical model that fuses the GAT [5] model and the Structure2Vec [6] model, and train them together on the intra-function characteristics and the call relationships between functions to achieve a more accurate function similarity comparison;
5) We implement a prototype called BiN. Our evaluation shows that BiN achieves a higher AUC than other state-of-the-art graph-based matching methods on the test set built from OpenSSL and BusyBox;
6) We test our prototype on a larger data set, and the results show that our implementation is accurate and efficient enough to handle real-world vulnerability detection
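The two-level idea can be sketched roughly as follows. This is a toy illustration, not the BiN model: the learned weight matrices of Structure2Vec and the learned attention function of GAT are replaced by fixed stand-ins (identity weights and dot-product attention), and all function names and data below are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def embed_function(node_feats, cfg_edges, rounds=2):
    """Level 1 (Structure2Vec-like): propagate block features along CFG edges.

    Assumption: identity weights instead of trained parameters.
    """
    n, dim = len(node_feats), len(node_feats[0])
    neighbors = {i: [] for i in range(n)}
    for u, v in cfg_edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    state = [[0.0] * dim for _ in range(n)]
    for _ in range(rounds):
        state = [
            [math.tanh(node_feats[i][k] + sum(state[j][k] for j in neighbors[i]))
             for k in range(dim)]
            for i in range(n)
        ]
    # Sum over basic blocks to get one vector per function.
    return [sum(state[i][k] for i in range(n)) for k in range(dim)]


def attend(func_vecs, call_graph, name):
    """Level 2 (GAT-like): softmax attention over a function and its call-graph neighbors.

    Assumption: dot-product scores stand in for GAT's learned attention.
    """
    me = func_vecs[name]
    group = [name] + call_graph.get(name, [])
    scores = [dot(me, func_vecs[g]) for g in group]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w / total * func_vecs[g][k] for w, g in zip(weights, group))
            for k in range(len(me))]


def cosine(a, b):
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


# Hypothetical toy data: one three-block function that calls a one-block helper.
func_vecs = {
    "parse": embed_function([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [(0, 1), (1, 2)]),
    "helper": embed_function([[0.5, 0.5]], []),
}
call_graph = {"parse": ["helper"], "helper": []}
final = attend(func_vecs, call_graph, "parse")
print(cosine(final, func_vecs["parse"]))
```

In a search setting, the final vector of each candidate function would be compared with the vector of a known vulnerable function, for example by cosine similarity as above, and the highest-scoring candidates inspected first.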

BACKGROUND

SEMANTIC LEARNING PREDICTOR

INTRA-FUNCTION FEATURE LEARNING MODEL

EVALUATION

ACCURACY OF VULNERABILITY SEARCH

RELATED WORK

CONCLUSION

Journal: IEEE Access: Practical Innovations, Open Solutions
Publication Date: Jan 1, 2019
Citations: 4
License type: CC BY 4.0
