Abstract

Machine learning-based fine-grained vulnerability detection is an important technique for locating vulnerable statements, which assists engineers in efficiently analyzing and fixing the vulnerabilities. However, due to insufficient code representations, code embeddings, and neural network design, current methods suffer low vulnerability localization performance. In this paper, we propose to address these shortcomings by presenting SlicedLocator, a novel fine-grained code vulnerability detection model that is trained in a dual-grained manner and can predict both program-level and statement-level vulnerabilities. We design the sliced dependence graph, a new code representation that not only preserves rich interprocedural relations but also eliminates vulnerability-irrelevant statements. We create attention-based code embedding networks that are trained with the entire model to extract vulnerability-aware code features. In addition, we present a new LSTM-GNN model as a fusion of semantic modeling and structural modeling. Experiment results on a large-scale C/C++ vulnerability dataset reveal that SlicedLocator outperforms state-of-the-art machine learning-based vulnerability detectors, especially in terms of localization metrics.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call