The software industry benefits from automated code generation techniques, yet faces problems with vulnerabilities in the resulting code. This research presents a hybrid model, termed GBD, for detecting vulnerabilities in software written in C and C++. It integrates a Graph Convolutional Network (GCN), Bidirectional Encoder Representations from Transformers (BERT), and Dropout. In Phase 2 of the GBD model, the following tasks are executed concurrently: (i) extracting node and edge features with the GCN; (ii) extracting segment features with the BERT model; (iii) constructing a source code profile via the Code Property Graph (CPG). Phase 3 applies Dropout to mitigate overfitting, and Phase 4 is the classifier that determines whether the source code contains vulnerabilities. Experimental results demonstrate the superiority of the proposed model over alternative methods, attaining a prediction accuracy of 61.21% for vulnerable code and 88.94% for normal files. The classification results also show that, with a token length of 512, the GBD model yields the most uniform results across all metrics: Accuracy (86.65%), Precision (38.59%), Recall (66.21%), and F1-score (48.76%). This is consistent with our analysis of the Verum experimental dataset, which indicates that over 70% of the source code files are longer than 256 tokens but shorter than 512. Furthermore, the GBD model performs strongly both on individual datasets and across multiple datasets. For example, on the Verum dataset, the GBD model surpasses five alternative methods (REVEAL [1], Russell [2], VulDeePecker [3], SySeVR [4], and Devign [5]) by 4% in Accuracy and by 15% to 57% in Precision, Recall, and F1-score. Compared with SySeVR [4], the GBD model exceeds it by 3% to 25% across all metrics, and compared with Devign [5], GBD achieves improvements of 5% to 39% in Precision, Recall, and F1-score. On the FFmpeg+Qemu dataset, the GBD model attains an Accuracy improvement of 0.2% to 10% over all other studies. In Precision, GBD surpasses the alternative methods by 0.3% to 9%. In Recall, GBD is 1.5% below REVEAL but surpasses all other methods by 10% to more than 31%. In F1-score, GBD is 0.3% below REVEAL but surpasses the other studies by 7% to 30%. These results indicate that the GBD model is effective both on individual datasets and across multiple datasets.
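The pipeline described above (a GCN branch over the CPG, a BERT branch over code segments, Dropout, and a final classifier) can be summarized in a minimal sketch. This is an illustration only, assuming a PyTorch implementation: the layer sizes, the concatenation-based fusion, the use of precomputed BERT segment embeddings, and all class and parameter names below are assumptions, not the authors' actual implementation.

```python
# Minimal sketch of a GBD-style fusion model: GCN branch + BERT branch
# -> Dropout -> classifier. All dimensions and names are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleGCNLayer(nn.Module):
    """One graph-convolution step: relu((A_hat @ X) W), with A_hat a
    normalized adjacency of the Code Property Graph (assumed precomputed)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj_norm):
        return F.relu(self.linear(adj_norm @ x))


class GBDClassifier(nn.Module):
    def __init__(self, node_dim=128, bert_dim=768, hidden_dim=256, dropout=0.5):
        super().__init__()
        # Phase 2a (assumed): graph branch over CPG node features
        self.gcn1 = SimpleGCNLayer(node_dim, hidden_dim)
        self.gcn2 = SimpleGCNLayer(hidden_dim, hidden_dim)
        # Phase 2b (assumed): segment features arrive as precomputed BERT
        # embeddings (e.g. a [CLS] vector per source file)
        self.segment_proj = nn.Linear(bert_dim, hidden_dim)
        # Phase 3: Dropout to mitigate overfitting
        self.dropout = nn.Dropout(dropout)
        # Phase 4: binary classifier (vulnerable vs. normal)
        self.classifier = nn.Linear(2 * hidden_dim, 2)

    def forward(self, node_feats, adj_norm, bert_cls):
        # Graph branch: two GCN layers, then mean-pool node embeddings
        h = self.gcn2(self.gcn1(node_feats, adj_norm), adj_norm)
        graph_repr = h.mean(dim=0)
        # Sequence branch: project the BERT segment embedding
        seq_repr = F.relu(self.segment_proj(bert_cls))
        # Fuse both representations, regularize, and classify
        fused = self.dropout(torch.cat([graph_repr, seq_repr], dim=-1))
        return self.classifier(fused)


if __name__ == "__main__":
    # Toy usage: 10 CPG nodes with 128-dim features, one 768-dim BERT vector
    model = GBDClassifier()
    nodes = torch.randn(10, 128)
    adj = torch.eye(10)        # placeholder normalized adjacency
    cls_vec = torch.randn(768)
    logits = model(nodes, adj, cls_vec)
    print(logits.shape)        # torch.Size([2])
```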