CVE Severity Prediction From Vulnerability Description - A Deep Learning Approach

Manjunatha A, Kethan Kota, Anoop S Babu, Sree Vivek S

doi:10.1016/j.procs.2024.04.294

Abstract

The Common Vulnerabilities and Exposures (CVE) system is a widely used standard for identifying and tracking known vulnerabilities in software systems. The severity of each vulnerability must be determined in order to prioritize mitigation efforts, but assigning severity is a challenging task that requires careful analysis of the vulnerability's characteristics and potential impact. Given the large number of vulnerabilities identified every year, automating severity assignment is vital for reducing manual effort. This paper proposes a novel approach for predicting the severity of a vulnerability from its CVE description using GPT-2, a pre-trained transformer language model. The imbalance in the distribution of CVSS severity values is addressed using oversampling and contextual data augmentation. The approach relies on the language modeling capabilities of GPT-2 to extract relevant features from CVE descriptions automatically and to predict the severity level of the vulnerability. Evaluated on a test set of 7,765 CVEs, the model achieves an accuracy of 84.2% and an F1 score of 0.82 in predicting severity. A comparative analysis against state-of-the-art methods shows that the proposed approach performs better. Based on these results, the approach could serve as a valuable tool for quickly and accurately identifying high-severity vulnerabilities, supporting more efficient and effective vulnerability management. It could also be extended to other natural language processing tasks related to vulnerability analysis and management.
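As an illustration of the class-balancing step mentioned above, the following is a minimal sketch of plain random oversampling, in which records from under-represented severity classes are duplicated until every class matches the largest one. The abstract does not specify which oversampling scheme was used, so this is an assumption made for illustration only; the function name `oversample`, the toy descriptions, and the severity labels shown are hypothetical placeholders rather than data from the paper.

```python
import random
from collections import Counter


def oversample(records, seed=0):
    """Randomly duplicate minority-class records until all classes are equal in size.

    `records` is a list of (description, severity_label) pairs.
    This is generic random oversampling, assumed here for illustration;
    it is not claimed to be the exact procedure used in the paper.
    """
    rng = random.Random(seed)

    # Group the records by their severity label.
    by_label = {}
    for text, label in records:
        by_label.setdefault(label, []).append((text, label))

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_label.values())

    balanced = []
    for items in by_label.values():
        balanced.extend(items)  # keep all original records
        # Add random duplicates to make up the shortfall.
        balanced.extend(rng.choice(items) for _ in range(target - len(items)))

    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: placeholder descriptions with imbalanced labels.
toy = [
    ("placeholder description a", "HIGH"),
    ("placeholder description b", "HIGH"),
    ("placeholder description c", "HIGH"),
    ("placeholder description d", "MEDIUM"),
    ("placeholder description e", "MEDIUM"),
    ("placeholder description f", "CRITICAL"),
]

balanced = oversample(toy)
counts = Counter(label for _, label in balanced)
print(counts)
```

Oversampling of this kind should be applied to the training split only, so that duplicated records do not leak into the test set and inflate the reported accuracy or F1 score.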

Full Text