CD-VulD: Cross-Domain Vulnerability Discovery Based on Deep Domain Adaptation

Shigang Liu, Guanjun Lin, Yang Xiang, Jun Zhang, Olivier De Vel, Lizhen Qu, Paul Montague

doi:10.1109/tdsc.2020.2984505

Abstract

Software vulnerabilities are a major cause of security incidents such as cyber attacks. Ideally, these vulnerabilities should be found and fixed before the code is deployed. Machine learning-based approaches achieve state-of-the-art performance in detecting vulnerabilities. These methods are predominantly supervised: their prediction models are trained on ground truth data under the assumption that the training data and test data are drawn from the same probability distribution. In practice, however, the test data often follows a different distribution from the training data, because the two come from different projects or cover different types of vulnerability. In this article, we present a new system for Cross-Domain Software Vulnerability Discovery (CD-VulD) using deep learning (DL) and domain adaptation (DA). We employ DL because it can automatically construct high-level abstract feature representations of programs, which are likely to be more useful across domains than handcrafted features driven by domain knowledge. The divergence between distributions is reduced by learning cross-domain representations. First, given software program representations, CD-VulD converts them into token sequences and learns token embeddings for generalization across tokens. Next, CD-VulD employs a deep feature model to build abstract high-level representations based on those sequences. Then, the metric transfer learning framework (MTLF) technique is employed to learn cross-domain representations by minimizing the distribution divergence between the source domain and the target domain. Finally, the cross-domain representations are used to build a classifier for vulnerability detection. Experimental results show that CD-VulD outperforms the state-of-the-art vulnerability detection approaches by a wide margin. We make the new datasets publicly available so that our work is replicable and can be further improved.
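
The first and third steps in the abstract (turning programs into token sequences, and measuring how far apart the source and target domains are) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the regular-expression tokenizer, the helper names, the toy code snippets, and the use of normalized token counts in place of learned deep representations. The divergence shown is a linear-kernel maximum mean discrepancy, used here only as a simple stand-in for "distribution divergence"; it is not the MTLF objective and not the authors' implementation.

```python
import re

# Hypothetical tokenizer: identifiers, integer literals, then any other
# single non-space character. The paper's actual tokenizer is not given here.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")

def tokenize(code):
    """Split a source-code string into a flat token sequence."""
    return TOKEN_RE.findall(code)

def build_vocab(sequences):
    """Map each distinct token to an integer id; 0 = padding, 1 = unknown."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for tok in sorted({t for seq in sequences for t in seq}):
        vocab[tok] = len(vocab)
    return vocab

def encode(tokens, vocab, max_len):
    """Convert tokens to ids, truncating or right-padding to max_len."""
    ids = [vocab.get(t, vocab["<unk>"]) for t in tokens[:max_len]]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))

def frequency_features(ids, vocab_size):
    """Stand-in for learned representations: normalized token-id counts."""
    real = [i for i in ids if i != 0]  # ignore padding positions
    feats = [0.0] * vocab_size
    for i in real:
        feats[i] += 1.0 / len(real)
    return feats

def linear_mmd2(source, target):
    """Squared distance between the mean feature vectors of two domains."""
    mean_s = [sum(col) / len(source) for col in zip(*source)]
    mean_t = [sum(col) / len(target) for col in zip(*target)]
    return sum((a - b) ** 2 for a, b in zip(mean_s, mean_t))

# Toy "projects": the source and target domains use different APIs.
source_code = ["strcpy(buf, src);", "memcpy(dst, src, n);"]
target_code = ["png_read(p, row);", "png_free(p, row);"]

source_seqs = [tokenize(c) for c in source_code]
target_seqs = [tokenize(c) for c in target_code]
vocab = build_vocab(source_seqs)  # vocabulary is built from the source domain only

def featurize(seqs):
    """Encode each sequence and turn it into a fixed-length feature vector."""
    return [frequency_features(encode(s, vocab, 12), len(vocab)) for s in seqs]

within = linear_mmd2(featurize(source_seqs), featurize(source_seqs))
across = linear_mmd2(featurize(source_seqs), featurize(target_seqs))
print("source vs source:", within)
print("source vs target:", across)
```

In this toy setting the divergence of the source domain against itself is zero, while the source-to-target divergence is positive because the target project uses tokens the source vocabulary has never seen. A domain adaptation step of the kind the abstract describes would aim to learn representations under which that second number becomes small while the vulnerable and non-vulnerable classes stay separable.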

Journal: IEEE Transactions on Dependable and Secure Computing
Publication Date: Apr 3, 2020
Citations: 34

Similar Papers

Cross-domain vulnerability detection using graph embedding and domain adaptation
Xin Li ... Yuling Chen
Computers & Security | VOL. 125
17 Nov 2022

A Stacked Auto-Encoder Based Partial Adversarial Domain Adaptation Model for Intelligent Fault Diagnosis of Rotating Machines
Zhao-Hua Liu ... Chang-Tong Wang
IEEE Transactions on Industrial Informatics | VOL. 17
15 Dec 2020

Visual-Depth Matching Network: Deep RGB-D Domain Adaptation With Unequal Categories
Ziyun Cai ... Xiao-Yuan Jing
IEEE Transactions on Cybernetics | VOL. 52
17 Nov 2020

Class Consistency Driven Unsupervised Deep Adversarial Domain Adaptation
Sayan Rakshit ... Ushasi Chaudhuri
01 Jun 2019