Abstract
As nations establish more vulnerability databases, these databases have accumulated hundreds of thousands of security vulnerability reports, which play a crucial role in protecting system security. However, many databases lack essential information, contain inaccuracies, or are inconsistent with one another. Despite these problems, vulnerability databases continue to grow in importance. Existing research on vulnerability databases has focused on software versions and vulnerability reproduction, whereas software names, an essential component of vulnerability databases, have not been studied extensively. Understanding how consistent software names are across vulnerability databases is crucial for improving the databases' accuracy. This paper introduces VERNIER, an automated method for measuring inconsistencies in 789,954 sets of software names drawn from nine security vulnerability databases (including CVE and NVD) covering 1999 to 2019. We use a named entity recognition (NER) model with an accuracy of 99.5% and an F1 score of 95.1% to extract software names from unstructured Chinese and English vulnerability reports. VERNIER assesses the inconsistency of software names at the character level and the semantic level. The results indicate that inconsistent software names are prevalent in vulnerability databases: the average exact-matching rate between NVD and other mainstream databases, such as CVE, is only 20.3% at the character level and 43.3% at the semantic level. We also discover internal inconsistencies between the structured and unstructured software names within the same vulnerability database (e.g., NVD). To mitigate the inconsistency, we implement an alert tool that uses these inconsistencies to detect incorrect software names. The tool can effectively flag and correct incorrect software names.
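To make the two measurement levels concrete, the following minimal Python sketch computes an exact-matching rate between two databases at the character level and at a semantic level. It is an illustration under stated assumptions, not VERNIER's implementation: the abstract does not describe how semantic matching is performed, so the normalization step and the `ALIASES` table are hypothetical stand-ins, and the vulnerability IDs and software names are made-up placeholders.

```python
import re

# Hypothetical alias table (an assumption for illustration only).
ALIASES = {"ie": "internet explorer", "msie": "internet explorer"}


def char_key(names):
    """Character-level view: the set of names exactly as written."""
    return frozenset(names)


def semantic_key(names):
    """Stand-in semantic view: lowercase, collapse punctuation, apply aliases."""
    normalized = set()
    for name in names:
        norm = re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()
        normalized.add(ALIASES.get(norm, norm))
    return frozenset(normalized)


def exact_match_rate(db_a, db_b, key):
    """Share of IDs present in both databases whose name sets are equal under `key`."""
    shared = db_a.keys() & db_b.keys()
    if not shared:
        return 0.0
    hits = sum(1 for vid in shared if key(db_a[vid]) == key(db_b[vid]))
    return hits / len(shared)


# Placeholder records: ID -> list of software names reported for that entry.
nvd = {
    "CVE-0000-0001": ["Internet Explorer"],
    "CVE-0000-0002": ["OpenSSL"],
    "CVE-0000-0003": ["Apache Tomcat"],
    "CVE-0000-0004": ["WordPress"],
}
other = {
    "CVE-0000-0001": ["IE"],
    "CVE-0000-0002": ["OpenSSL"],
    "CVE-0000-0003": ["apache-tomcat"],
    "CVE-0000-0004": ["Joomla"],
}

char_rate = exact_match_rate(nvd, other, char_key)
sem_rate = exact_match_rate(nvd, other, semantic_key)
print(f"character-level: {char_rate:.2f}, semantic-level: {sem_rate:.2f}")
```

In this toy data only one of the four shared entries matches character for character, while three match once spelling variants and aliases are reconciled, which mirrors the direction of the gap the abstract reports between the two levels.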