Abstract

Machine learning models are increasingly used as practical tools for source code vulnerability detection. However, these models and techniques still vary in usability and predictive performance. To generalize models toward a natural language approach, researchers have trained them directly on source code to identify existing and potential vulnerabilities. Exploratory research has treated source code as plain text, and models have been trained and tested both within and across domains, such as individual projects. A recent empirical study found that such text-based models are not effective across domains. In this study, we examine results from this text-based approach to machine learning models. Through a statistical analysis of the data used and the resulting predictions, we explore why the approach may not be sufficient for vulnerability prediction across domains.
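For illustration only, the following is a minimal sketch of the kind of text-based pipeline the abstract alludes to, assuming a TF-IDF bag-of-tokens representation, a logistic regression classifier, and toy data; the actual models, features, and datasets used in the underlying studies are not specified here.

```python
# Illustrative sketch (not the paper's exact setup): treat source code as plain
# text, vectorize it with TF-IDF, train on one project ("within-domain" source),
# and evaluate on a different project ("cross-domain" target).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Hypothetical toy data: code snippets as raw strings, 1 = vulnerable, 0 = not.
project_a_code = [
    "strcpy(buf, user_input);",                  # unchecked copy
    "if (len < MAX) memcpy(dst, src, len);",
    "gets(line);",                               # unbounded read
    "fgets(line, sizeof(line), stdin);",
]
project_a_labels = [1, 0, 1, 0]

project_b_code = [
    "sprintf(out, \"%s\", name);",
    "snprintf(out, sizeof(out), \"%s\", name);",
]
project_b_labels = [1, 0]

# Token-level TF-IDF over code treated as ordinary text.
vectorizer = TfidfVectorizer(token_pattern=r"\w+")
X_train = vectorizer.fit_transform(project_a_code)
clf = LogisticRegression().fit(X_train, project_a_labels)

# Cross-domain evaluation: project B's vocabulary and token statistics can
# differ from project A's, which is one reason performance may degrade.
X_test = vectorizer.transform(project_b_code)
preds = clf.predict(X_test)
print("cross-domain F1:", f1_score(project_b_labels, preds))
```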
