Time for Addressing Software Security Issues: Prediction Models and Impacting Factors

Lotfi Ben Othmane, Golriz Chehrazi, Eric Bodden, Achim D Brucker, Petar Tsalovski

doi:10.1007/s41019-016-0019-8

Abstract

Finding and fixing software vulnerabilities has become a major struggle for most software development companies. While generally without alternative, such fixing efforts are a major cost factor, which is why companies have a vital interest in focusing their secure software development activities so that they obtain an optimal return on this investment. In this paper, we quantitatively investigate the major factors that impact the time it takes to fix a given security issue, based on data collected automatically within SAP’s secure development process, and we show how the issue fix time could be used to monitor the fixing process. We use three machine learning methods and evaluate their power in predicting the time to fix issues. Interestingly, the models indicate that vulnerability type has a less dominant impact on issue fix time than previously believed. The time it takes to fix an issue instead seems much more related to the component in which the potential vulnerability resides, the project related to the issue, the development groups that address the issue, and the closeness of the software release date. This indicates that the software structure, the fixing processes, and the development groups are the dominant factors that impact the time spent to address security issues. SAP can use the models to continuously improve its secure software development process and to measure the impact of individual improvements. The development teams at SAP develop different types of software, adopt different internal development processes, use different programming languages and platforms, and are located in different cities and countries. Other organizations may therefore use the results, with precaution, and become learning organizations.

Highlights

Fixing vulnerabilities, before and after a release, is one of the most costly and unproductive software engineering activities
This study showed that the models generated using the linear regression (LR), Recursive PARTitioning (RPART), and Neural Network Regression (NNR) methods have conflicting accuracy measurements in predicting the issue fix time

Summary

Introduction

Fixing vulnerabilities, before and after a release, is one of the most costly and unproductive software engineering activities. Software development companies have an interest in determining the factors that impact the effort and the time it takes to fix security issues. Regression models relate the quantity of a response factor, i.e., the dependent variable, to the independent variables. In general, a regression model is assumed to be good if it predicts responses close to the actual values observed in reality. We provide background about the regression methods, the models’ performance metrics, and a metric for measuring the relative importance of the prediction factors used in the models.
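To make the idea of a regression model and its "closeness" to observed values concrete, the following is a minimal sketch, not the study's implementation. It assumes a single numeric predictor and ordinary least squares; the data values, the variable names, and the choice of the coefficient of determination and mean absolute error as closeness measures are all illustrative assumptions, since this summary does not name the metrics used.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(actual, predicted):
    """Share of the variance in `actual` explained by `predicted`."""
    y_bar = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - y_bar) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def mean_absolute_error(actual, predicted):
    """Average absolute gap between observed and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical data (assumption): days until the next release when the
# issue was reported, and the observed issue fix time in days.
days_to_release = [5, 10, 20, 40, 60, 90]
fix_time_days = [3, 6, 9, 20, 31, 44]

intercept, slope = fit_simple_linear_regression(days_to_release, fix_time_days)
predicted = [intercept + slope * x for x in days_to_release]

print("R^2:", r_squared(fix_time_days, predicted))
print("MAE (days):", mean_absolute_error(fix_time_days, predicted))
```

A model is "good" in the sense described above when the predicted values stay close to the observed ones, which shows up here as a coefficient of determination near 1 and a small mean absolute error. Real issue data would involve several, mostly categorical, factors such as component, project, and development group, which this one-predictor sketch does not cover.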

Objectives

Methods

Results

Conclusion