Abstract

The Common Vulnerabilities and Exposures (CVE) system is a widely used standard for identifying and tracking known vulnerabilities in software systems. The severity of each vulnerability must be determined in order to prioritize mitigation efforts, but assigning severity is a challenging task that requires careful analysis of the vulnerability's characteristics and potential impact. Given the vast number of vulnerabilities identified every year, automating severity assignment is vital for reducing manual effort. This paper proposes a novel approach for predicting the severity of a vulnerability from its CVE description using GPT-2, a state-of-the-art language model. The imbalance in the distribution of CVSS severity values is addressed using oversampling and contextual data augmentation techniques. The approach leverages the large-scale language modeling capabilities of GPT-2 to automatically extract relevant features from CVE descriptions and predict the severity level of the vulnerability. Evaluated on a test set of 7,765 CVEs, the model achieves an accuracy of 84.2% and an F1 score of 0.82 in predicting severity. A comparative analysis against state-of-the-art methods demonstrates the superior performance of the proposed approach. Based on these results, the proposed approach could be considered a valuable tool for quickly and accurately identifying high-severity vulnerabilities, facilitating more efficient and effective vulnerability management practices. Furthermore, the approach could be extended to other natural language processing tasks related to vulnerability analysis and management.
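The abstract does not say how oversampling was carried out, so the following is only a minimal sketch of one common variant, plain random oversampling, in which minority severity classes are padded with duplicated descriptions until every class matches the largest one. The function name `random_oversample`, the severity label strings, and the toy descriptions are all assumptions made for illustration; they are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(descriptions, labels, seed=0):
    """Duplicate minority-class samples until all classes are equally large.

    Hypothetical helper: an illustration of random oversampling,
    not the procedure used in the paper.
    """
    rng = random.Random(seed)

    # Group the descriptions by their severity label.
    by_label = {}
    for text, label in zip(descriptions, labels):
        by_label.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(texts) for texts in by_label.values())

    out_texts, out_labels = [], []
    for label, texts in by_label.items():
        # Draw duplicates, with replacement, from the existing samples.
        extra = [rng.choice(texts) for _ in range(target - len(texts))]
        for text in texts + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


if __name__ == "__main__":
    # Invented toy data; real input would be CVE descriptions and severities.
    texts = [
        "Buffer overflow allows remote code execution.",
        "SQL injection in the login form.",
        "Improper authentication in the admin API.",
        "Verbose error message discloses the software version.",
    ]
    labels = ["HIGH", "HIGH", "HIGH", "LOW"]
    balanced_texts, balanced_labels = random_oversample(texts, labels)
    print(Counter(balanced_labels))
```

Oversampling of this kind should be applied to the training split only, so that duplicated descriptions do not leak into the test set. Because it only repeats existing text, it is typically combined with an augmentation step that produces varied descriptions, which is the role the abstract assigns to contextual data augmentation.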
