Abstract

Software vulnerability detection is critical to computer security. Most existing vulnerability detection methods rely on single-modal models, which cannot effectively extract cross-modal features. To address this problem, we propose a new multimodal deep learning based vulnerability detection method built on cross-modal feature enhancement and fusion. First, we use a special compilation and debugging procedure to obtain the alignment between source code statements and assembly instructions, as well as between source code variables and assembly code registers. Based on this alignment and program slicing, we propose a cross-slicing method to generate bimodal program slices. We then propose a cross-modal feature enhanced code representation learning model that uses co-attention mechanisms to capture fine-grained semantic correlations between source code and assembly code. Finally, vulnerability detection is performed by feature-level fusion of the semantic features captured from the fine-grained aligned source code and assembly code. Extensive experiments show that our method improves vulnerability detection performance over state-of-the-art methods. Specifically, it achieves an accuracy of 97.4% and an F1-measure of 93.4% on the SARD dataset, and an average accuracy of 95.4% and an F1-measure of 89.1% on two real-world software projects (i.e., FFmpeg and OpenSSL), improving over the state-of-the-art method by 4.5% and 2.9%, respectively.
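For illustration only, the cross-modal co-attention and feature-level fusion described above could be sketched roughly as follows in PyTorch. All names, layer sizes, the bilinear affinity formulation, the mean pooling, and the classifier head are assumptions made for this sketch, not the paper's exact model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoAttentionFusion(nn.Module):
    """Hypothetical sketch: co-attention between source-code and assembly-code
    token features of a bimodal program slice, followed by feature-level fusion."""

    def __init__(self, dim: int = 256, num_classes: int = 2):
        super().__init__()
        self.scale = dim ** -0.5
        # Projections used to form the cross-modal affinity matrix (assumed design).
        self.proj_src = nn.Linear(dim, dim)
        self.proj_asm = nn.Linear(dim, dim)
        # Classifier over the concatenated (fused) slice representation.
        self.classifier = nn.Linear(2 * dim, num_classes)

    def forward(self, src_feats, asm_feats):
        # src_feats: (batch, n_src_tokens, dim) -- source-code slice embeddings
        # asm_feats: (batch, n_asm_tokens, dim) -- assembly-code slice embeddings
        # Cross-modal affinity: each source token attends over assembly tokens and vice versa.
        affinity = torch.bmm(self.proj_src(src_feats),
                             self.proj_asm(asm_feats).transpose(1, 2)) * self.scale
        src_ctx = torch.bmm(F.softmax(affinity, dim=-1), asm_feats)                   # assembly-aware source features
        asm_ctx = torch.bmm(F.softmax(affinity.transpose(1, 2), dim=-1), src_feats)   # source-aware assembly features
        # Enhance each modality with its attended context, then pool over tokens.
        src_vec = (src_feats + src_ctx).mean(dim=1)
        asm_vec = (asm_feats + asm_ctx).mean(dim=1)
        # Feature-level fusion and vulnerability classification.
        fused = torch.cat([src_vec, asm_vec], dim=-1)
        return self.classifier(fused)
```

A module like this would sit on top of per-modality encoders for the source-code and assembly-code slices; the choice of encoders and of the attention normalization are design decisions the abstract does not specify.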
