A novel online supervised hyperparameter tuning procedure applied to cross-company software effort estimation

Leandro L. Minku

doi:10.1007/s10664-019-09686-w

Abstract

Software effort estimation is an online supervised learning problem, where new training projects may become available over time. In this scenario, the Cross-Company (CC) approach Dycom can drastically reduce the number of Within-Company (WC) projects needed for training, saving their collection cost. However, Dycom requires CC projects to be split into subsets. Both the number and composition of such subsets can affect Dycom’s predictive performance. Even though clustering methods could be used to automatically create CC subsets, there are no procedures for automatically tuning the number of clusters over time in online supervised scenarios. This paper proposes the first procedure for that. An investigation of Dycom using six clustering methods and three automated tuning procedures is performed, to check whether clustering with automated tuning can create well-performing CC splits. A case study with the ISBSG Repository shows that the proposed tuning procedure in combination with a simple threshold-based clustering method is the most successful in enabling Dycom to drastically reduce (by a factor of 10) the number of required WC training projects, while maintaining (or even improving) predictive performance in comparison with a corresponding WC model. A detailed analysis is provided to understand the conditions under which this approach does or does not work well. Overall, the proposed online supervised tuning procedure was generally successful in enabling a very simple threshold-based clustering approach to obtain the most competitive Dycom results. This demonstrates the value of automatically tuning hyperparameters over time in a supervised way.
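The abstract does not spell out how the tuning procedure works, so the following is only a minimal sketch of the general idea of tuning a hyperparameter over time in a supervised way. It assumes a test-then-train scheme: one model is kept per candidate hyperparameter value, each candidate is scored on every newly completed project before being trained on it, and predictions come from the candidate with the lowest accumulated error so far. The names `ToyOnlineModel` and `OnlineSupervisedTuner`, the smoothing hyperparameter `alpha`, and the effort values are all hypothetical; the toy model merely stands in for Dycom with a given number of CC subsets and is not the paper's method.

```python
class ToyOnlineModel:
    """Hypothetical stand-in for an online effort estimator.

    Predicts an exponentially weighted mean of the efforts seen so far;
    `alpha` plays the role of the hyperparameter being tuned.
    """

    def __init__(self, alpha):
        self.alpha = alpha
        self.estimate = 0.0
        self.seen = 0

    def predict(self):
        return self.estimate

    def update(self, effort):
        if self.seen == 0:
            self.estimate = effort
        else:
            self.estimate += self.alpha * (effort - self.estimate)
        self.seen += 1


class OnlineSupervisedTuner:
    """Keeps one model per candidate value and tracks each one's error."""

    def __init__(self, candidates):
        self.candidates = candidates
        self.abs_error = [0.0] * len(candidates)

    def best_index(self):
        # Candidate with the lowest accumulated absolute error so far.
        return min(range(len(self.candidates)), key=lambda i: self.abs_error[i])

    def predict(self):
        return self.candidates[self.best_index()].predict()

    def observe(self, effort):
        # Test-then-train: score every candidate on the newly completed
        # project before any of them is trained on it.
        for i, model in enumerate(self.candidates):
            self.abs_error[i] += abs(model.predict() - effort)
            model.update(effort)


# Made-up stream of actual efforts with a shift halfway through.
tuner = OnlineSupervisedTuner([ToyOnlineModel(0.1), ToyOnlineModel(0.9)])
for actual_effort in [100.0, 100.0, 100.0, 500.0, 500.0, 500.0]:
    tuner.observe(actual_effort)

print("selected alpha:", tuner.candidates[tuner.best_index()].alpha)
```

Because the selection is recomputed from errors on labelled projects as they arrive, the chosen value can change over time, which is the property the abstract highlights for tuning the number of clusters.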

Highlights

Software Effort Estimation (SEE) is the process of estimating the effort required to develop a software project
The second aim of this paper is to investigate whether clustering methods are successful in avoiding poor CC splits for Dycom when used in combination with (1) the proposed online supervised hyperparameter tuning procedure, and (2) existing offline unsupervised procedures for tuning the number of CC subsets
Dycom is able to reduce the number of WC training projects required for training SEE models while maintaining or improving the predictive performance of a corresponding WC approach

Summary

Introduction

Software Effort Estimation (SEE) is the process of estimating the effort required to develop a software project. The use of machine learning for creating SEE models based on data describing completed projects has been studied for many years (Boehm 1981; Jørgensen and Shepperd 2007; Dejaeger et al 2012; Sarro et al 2016; Song et al 2018). Such SEE models could form useful tools to help experts perform and/or rethink their effort estimations. These estimations can be used to inform many important project decisions, such as project bidding, requirements selection and task allocation. Such methods are widely advocated by professional societies and certification programs (see for example http://www.iceaaonline.com/, tiny.cc/gox9xy and tiny.cc/hmx9xy).

Journal: Empirical Software Engineering
Publication Date: Feb 26, 2019
Citations: 29
License type: open-access

Similar Papers

Multi-stream online transfer learning for software effort estimation: is it necessary?
Leandro L Minku
19 Aug 2021

Clustering Dycom
Leandro L Minku ... Siqing Hou
08 Nov 2017

On the Terms Within- and Cross-Company in Software Effort Estimation
Leandro L Minku
09 Sep 2016

How to make best use of cross-company data in software effort estimation?
Leandro L Minku ... Xin Yao
31 May 2014
