VulEye: A Novel Graph Neural Network Vulnerability Detection Approach for PHP Application

Chun Lin,Yong Fang,Zhonglin Liu,Yijia Xu

doi:10.3390/app13020825

Abstract

Following advances in machine learning and deep learning processing, cyber security experts are committed to creating deep intelligent approaches for automatically detecting software vulnerabilities. Nowadays, many practices are for C and C++ programs, and methods rarely target PHP application. Moreover, many of these methods use LSTM (Long Short-Term Memory) but not GNN (Graph Neural Networks) to learn the token dependencies within the source code through different transformations. That may lose a lot of semantic information in terms of code representation. This article presents a novel Graph Neural Network vulnerability detection approach, VulEye, for PHP applications. VulEye can assist security researchers in finding vulnerabilities in PHP projects quickly. VulEye first constructs the PDG (Program Dependence Graph) of the PHP source code, slices PDG with sensitive functions contained in the source code into sub-graphs called SDG (Sub-Dependence Graph), and then makes SDG the model input to train with a Graph Neural Network model which contains three stack units with a GCN layer, Top-k pooling layer, and attention layer, and finally uses MLP (Multi-Layer Perceptron) and softmax as a classifier to predict if the SDG is vulnerable. We evaluated VulEye on the PHP vulnerability test suite in Software Assurance Reference Dataset. The experiment reports show that the best macro-average F1 score of the VulEye reached 99% in the binary classification task and 95% in the multi-classes classification task. VulEye achieved the best result compared with the existing open-source vulnerability detection implements and other state-of-art deep learning models. Moreover, VulEye can also locate the precise area of the flaw, since our SDG contains code slices closely related to vulnerabilities with a key triggering sensitive/sink function.

Full Text