Abstract

The exploitation of software security vulnerabilities can have severe consequences. It is therefore crucial to devise new processes, techniques, and tools that support teams in developing secure code from the early stages of the software development process, while potentially reducing costs and shortening time to market. In this paper, we propose an approach that uses security evidence (e.g., software metrics, bad smells) to feed a set of trustworthiness models, which characterize code from a security perspective. In practice, the goal is to identify the code units that are more likely to be vulnerable (i.e., less trustworthy from a security perspective), thus helping developers improve their code. A clustering-based approach categorizes the code units based on the combined scores provided by several trustworthiness models, taking into account the criticality of the code. To instantiate our proposal, we use a dataset of software metrics (e.g., CountLine, Cyclomatic Complexity, Coupling Between Objects) for files and functions of the Linux Kernel and Mozilla Firefox projects, and a set of machine learning algorithms (i.e., Random Forest, Decision Tree, SVM Linear, SVM Radial, and XGBoost) to build the trustworthiness models. Results show that code that is more likely to be vulnerable can be effectively distinguished, demonstrating the applicability and usefulness of the proposed approach in diverse scenarios.
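The categorization step described above can be illustrated with a minimal sketch. It assumes that each code unit already has one score in [0, 1] per trustworthiness model (1 meaning most trustworthy) and a criticality value in [0, 1]. The unit names, the scores, the criticality weighting, and the choice of three clusters are hypothetical values chosen for illustration; the abstract does not specify how scores and criticality are combined, and the plain 1-D k-means used here stands in for whatever clustering method the paper applies.

```python
from statistics import mean


def combined_score(model_scores, criticality):
    """Average the per-model scores, then discount by criticality.

    The weighting (up to a 50% discount for fully critical code) is an
    illustrative assumption, not the formula used in the paper.
    """
    return mean(model_scores) * (1.0 - 0.5 * criticality)


def kmeans_1d(values, k, iterations=50):
    """Plain 1-D k-means for k >= 2; returns one cluster index per value.

    Cluster 0 has the lowest centroid, cluster k - 1 the highest.
    """
    ordered = sorted(values)
    # Spread the initial centroids evenly across the sorted values.
    centroids = [ordered[(i * (len(ordered) - 1)) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            groups[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]


# Hypothetical code units: (scores from three models, criticality).
units = {
    "parse_header": ([0.90, 0.85, 0.95], 0.2),
    "copy_buffer": ([0.20, 0.30, 0.25], 0.9),
    "log_event": ([0.80, 0.90, 0.85], 0.1),
    "handle_ioctl": ([0.40, 0.35, 0.30], 0.8),
    "format_date": ([0.60, 0.55, 0.65], 0.3),
}

names = list(units)
scores = [combined_score(s, c) for s, c in units.values()]
labels = kmeans_1d(scores, k=3)

category_names = ["low", "medium", "high"]  # trustworthiness categories
categories = {name: category_names[label] for name, label in zip(names, labels)}

for name in names:
    print(f"{name}: {categories[name]} trustworthiness")
```

In this sketch, units that land in the "low" category would be the first candidates for security review. In a real instantiation the per-model scores would come from classifiers trained on software metrics rather than from hand-written values.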
