Locating Vulnerability in Binaries Using Deep Neural Networks

Runhao Li, Chao Feng, Chen Zhang, Xing Zhang, Chaojing Tang

doi:10.1109/access.2019.2942043

Abstract

Binary fault localization is important for vulnerability analysis, but many current techniques struggle to locate vulnerabilities accurately and effectively, especially in real-world programs. In this paper, we propose a novel gradient-guided vulnerability locating method named DeepVL, which leverages deep neural networks to diagnose the root cause of weaknesses in binaries and to provide guidance for further analysis. DeepVL collects a sufficient number of crashed execution traces and normal execution traces as input to the constructed neural networks. Based on the trained neural network, DeepVL calculates the gradient information for each basic block in the traces and filters out the vulnerable basic blocks according to the corresponding gradients. To demonstrate the applicability of DeepVL, we perform extensive experiments on different datasets. According to the experimental results on the Common Weakness Enumeration (CWE) dataset, DeepVL can locate different types of vulnerabilities accurately and effectively, with recall@10 reaching 96.9% and precision@10 reaching 70.1%. Additionally, the results on Cyber Grand Challenge (CGC) program and LibTIFF 4.0.10 show that DeepVL is capable of locating vulnerable basic blocks in large-scale programs. As a fault localization tool, DeepVL can greatly reduce the manual effort of finding vulnerabilities in binaries.
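To make the idea of gradient-guided ranking concrete, the following is a minimal sketch and not the DeepVL implementation. It assumes that each execution trace is encoded as a binary vector marking which basic blocks were executed, and it substitutes a single logistic unit for the paper's deep network, whose architecture is not described in this section. The function names (`train_logistic`, `rank_blocks`, `precision_at_k`, `recall_at_k`), the synthetic traces, and the choice of block 3 as the vulnerable block are all hypothetical. The two metrics follow the usual information-retrieval definitions, which may differ from the definitions the paper uses for recall@10 and precision@10.

```python
import math
from itertools import product


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(traces, labels, epochs=500, lr=0.5):
    """Fit one logistic unit by full-batch gradient descent.

    Assumption: this stands in for the deep network used by DeepVL.
    """
    n_blocks = len(traces[0])
    weights = [0.0] * n_blocks
    bias = 0.0
    n = len(traces)
    for _ in range(epochs):
        grad_w = [0.0] * n_blocks
        grad_b = 0.0
        for x, y in zip(traces, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # derivative of the log loss w.r.t. the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def rank_blocks(traces, labels, weights, bias):
    """Rank blocks by the mean input gradient dP(crash)/dx_j over crashed traces."""
    n_blocks = len(weights)
    crashed = [x for x, y in zip(traces, labels) if y == 1]
    scores = [0.0] * n_blocks
    for x in crashed:
        p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
        for j in range(n_blocks):
            # For a logistic unit, dp/dx_j = p * (1 - p) * w_j.
            scores[j] += p * (1.0 - p) * weights[j] / len(crashed)
    return sorted(range(n_blocks), key=lambda j: scores[j], reverse=True)


def precision_at_k(ranking, vulnerable, k):
    """Fraction of the top-k ranked blocks that are truly vulnerable."""
    return sum(1 for b in ranking[:k] if b in vulnerable) / k


def recall_at_k(ranking, vulnerable, k):
    """Fraction of the truly vulnerable blocks that appear in the top k."""
    return sum(1 for b in ranking[:k] if b in vulnerable) / len(vulnerable)


# Synthetic data (hypothetical): block 0 is the entry block and always runs;
# blocks 1..5 take every on/off combination; a trace crashes iff block 3 runs.
traces = [(1,) + bits for bits in product((0, 1), repeat=5)]
labels = [x[3] for x in traces]

weights, bias = train_logistic(traces, labels)
ranking = rank_blocks(traces, labels, weights, bias)
vulnerable = {3}
print("ranking:", ranking)
print("precision@2:", precision_at_k(ranking, vulnerable, 2))
print("recall@2:", recall_at_k(ranking, vulnerable, 2))
```

In this toy setting the block whose execution determines the crash receives the largest gradient and is ranked first. A real binary would require trace collection, a far larger block vocabulary, and a deeper model, none of which this sketch attempts.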

Highlights

Binary program vulnerability is an important research topic of software security, and a hotspot of cyber security
We propose a novel method based on deep neural networks
For the first time, this paper proposes a method that uses the gradient information of neural networks (NNs) to diagnose the locations of bugs in binaries

Summary

Introduction

Binary program vulnerability is an important research topic in software security and a hotspot in cyber security. A large number of software vulnerabilities are discovered every day, causing serious harm to computer systems and program security [1]. Locating and eliminating vulnerabilities efficiently has always been a focus of the software security field. Locating vulnerabilities in binary programs is difficult and time-consuming. Researchers spend a great deal of time every year finding and fixing vulnerabilities in programs, at considerable financial cost [2], [3]. Most fault localization techniques rely on manual debugging with tools or on human experience with vulnerabilities to locate weaknesses. Since programs are larger and much more complex nowadays, making it more difficult to perform manual

Objectives

Methods

Results