UniBin: Assembly semantic-enhanced binary vulnerability detection without disassembly

Li Liu, Shen Wang, Xunzhi Jiang

doi:10.1016/j.ins.2024.121605

Abstract

The widespread reuse of open-source code amplifies the impact of vulnerabilities. Current vulnerability detection methods predominantly rely on binary code similarity comparison, which requires disassembly to obtain assembly code or control flow graphs. These methods depend on specific disassembly tools and complex preprocessing, which limits their applicability and detection speed. This paper proposes UniBin, a vulnerability detection method based on a multi-layer Transformer encoder. By applying bidirectional LM, unidirectional LM, and sequence-to-sequence LM tasks to both binary and assembly code during pre-training, UniBin learns richer semantic information from binary machine code, enabling efficient similarity comparison without disassembly and mitigating the limitations that disassembly imposes. We cross-compile 55 widely used open-source C projects to build the datasets. After 52 hours of pre-training and 8 hours of fine-tuning, UniBin reaches an average accuracy of 98.3% in similarity detection across compilation conditions, outperforming the state-of-the-art method. For search tasks across optimization options with a pool size of 1000, the Recall@1 metric improves by a relative 28.2% (from 67.9% to 87.1%). UniBin eliminates the dependency on specific disassembly tools and improves end-to-end binary analysis speed by over 36%. In real-world vulnerability detection tasks, UniBin detects all vulnerable functions while achieving the lowest false positive rate, 0.16%.
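The Recall@1 search metric mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each function is represented by an embedding vector and that candidates are ranked by cosine similarity; the abstract specifies neither, so the names `cosine`, `recall_at_k`, and `relative_gain` are hypothetical helpers and not part of UniBin. The last helper shows that moving from 67.9% to 87.1% is a gain of 19.2 percentage points, or roughly 28% in relative terms.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recall_at_k(queries, pool, truth, k=1):
    """Fraction of queries whose true match appears among the top-k ranked pool entries.

    queries: list of query embeddings (assumed representation, not from the paper)
    pool:    list of candidate embeddings (the abstract uses a pool of 1000)
    truth:   truth[i] is the index in `pool` of the correct match for queries[i]
    """
    hits = 0
    for query, target in zip(queries, truth):
        # Rank every pool entry by similarity to the query, most similar first.
        ranked = sorted(range(len(pool)),
                        key=lambda j: cosine(query, pool[j]),
                        reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(queries)


def relative_gain(old, new):
    """Relative improvement of `new` over `old`, as a fraction of `old`."""
    return (new - old) / old


if __name__ == "__main__":
    # Toy pool of three 2-D embeddings; real pools would hold 1000 functions.
    pool = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    queries = [[0.9, 0.1], [0.1, 0.9]]
    print(recall_at_k(queries, pool, truth=[0, 1], k=1))
    print(round(relative_gain(67.9, 87.1), 3))
```

Sorting the whole pool for each query keeps the sketch short; for large pools a partial selection such as `heapq.nlargest` would avoid the full sort.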

More From: Information Sciences

Similar Papers

BinAIV: Semantic-enhanced vulnerability detection for Linux x86 binaries
Yeming Gu ... Fei Kang
Computers & Security | VOL. 135
27 Sep 2023

Research and implementation of obfuscation binary code similarity detection
Yang Zhang ... Can Cui
09 Dec 2022

A vulnerability detection framework with enhanced graph feature learning
Jianxin Cheng ... Hanpin Wang
The Journal of Systems & Software | VOL. 216
01 Jun 2024

MVD-HG: multigranularity smart contract vulnerability detection method based on heterogeneous graphs
Jingjie Xu ... Baiyang Ji
Cybersecurity | VOL. 7
11 Oct 2024
