Consolidating evidence based studies in software cost/effort estimation — A tertiary study

Sreekumar P Pillai, T Radharamanan, S.D Madhukumar

doi:10.1109/tencon.2017.8227974

Abstract

Software Effort Estimation is key to the success of any project, since downstream activities such as planning, budgeting, development and monitoring cannot be executed without clarity on the scope of the work to be performed. This tertiary study follows the Systematic Literature Review (SLR) process put forth by Kitchenham in her seminal paper and is organised around five criteria, including estimation technique, estimation accuracy, type of dataset and independent variables used in empirical research on effort estimation. Our study, covering 820 Primary Studies through 14 SLRs, shows that Software Effort Estimation studies focus more on statistical techniques and that Machine Learning is taking precedence over the others, whereas Expert Judgement is preferred by the industry due to its intuitiveness. There is a need for models that are simple to understand and global, due to the distributed nature of software development. The studies are inconclusive about the accuracy benefits of using a within-company dataset vs. external datasets. Machine learning techniques such as Fuzzy Logic (FL) and Genetic Algorithms (GA), in combination with Analogy methods, generate more accurate estimates. There is increasing consensus on the use of Mean Magnitude of Relative Error (MMRE), Median Magnitude of Relative Error (MdMRE) and Prediction at the 25% level (Pred(25)) as accuracy metrics. 78% of the Primary Studies reported accuracy using MMRE. The best MMRE reported is in the range of 7 to 75. The ISBSG (International Software Benchmarking Standards Group) and Desharnais datasets, with 27% and 17% usage respectively, are the most widely used datasets in empirical studies on effort estimation. Fewer than 20 independent variables account for more than 90% of the impact of variables in empirical analyses of software effort estimation.
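The three accuracy metrics named in the abstract can be illustrated with a short sketch. It assumes the conventional definitions: the magnitude of relative error for one project is |actual - estimated| / actual, MMRE and MdMRE are the mean and median of those values, and Pred(25) is the fraction of projects whose error is at most 25%. The function names and the sample effort figures are hypothetical and are not taken from the study.

```python
from statistics import mean, median


def mre(actual, predicted):
    """Magnitude of relative error for a single project (actual must be non-zero)."""
    return abs(actual - predicted) / actual


def accuracy_metrics(actuals, predictions, level=0.25):
    """Return MMRE, MdMRE and Pred(level) for paired actual/estimated efforts."""
    errors = [mre(a, p) for a, p in zip(actuals, predictions)]
    return {
        "MMRE": mean(errors),      # mean of the relative errors
        "MdMRE": median(errors),   # median of the relative errors
        # share of projects estimated within `level` of the actual effort
        "Pred": sum(e <= level for e in errors) / len(errors),
    }


# Hypothetical efforts in person-hours, for illustration only.
actuals = [100.0, 200.0, 400.0, 800.0]
predictions = [110.0, 150.0, 400.0, 1200.0]
metrics = accuracy_metrics(actuals, predictions)
print(metrics)
```

Lower MMRE and MdMRE indicate better estimates, while a higher Pred(25) is better. MdMRE is less sensitive than MMRE to a few badly estimated projects, which is one reason the two are often reported together.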

Full Text