Active learning and effort estimation: Finding the essential content of software effort estimation data

Ekrem Kocaguneli, Jacky Keung, Ray Madachy, Tim Menzies, David Cok

doi:10.1109/tse.2012.88

Abstract

Background: Do we always need complex methods for software effort estimation (SEE)? Aim: To characterize the essential content of SEE data, i.e., the least number of features and instances required to capture the information within SEE data. If the essential content is very small, then 1) the contained information must be very brief and 2) the value added by complex learning schemes must be minimal. Method: Our QUICK method computes the Euclidean distance between rows (instances) and columns (features) of SEE data, then prunes synonyms (similar features) and outliers (distant instances), then assesses the reduced data by comparing predictions from 1) a simple learner using the reduced data and 2) a state-of-the-art learner (CART) using all data. Performance is measured using hold-out experiments and expressed in terms of mean and median MRE, MAR, PRED(25), MBRE, MIBRE, or MMER. Results: For 18 datasets, QUICK pruned 69 to 96 percent of the training data (median = 89 percent). K = 1 nearest neighbor predictions (in the reduced data) performed as well as CART's predictions (using all data). Conclusion: The essential content of some SEE datasets is very small. Complex estimation methods may be overelaborate for such datasets and can be simplified. We offer QUICK as an example of such a simpler SEE method.
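The pipeline outlined in the abstract (distances between columns and rows, pruning, then a K = 1 nearest neighbor estimate scored by MRE and PRED(25)) can be illustrated with a minimal Python sketch. This is not the authors' QUICK implementation: the abstract does not state the pruning rules, so the greedy distance threshold for synonyms, the `keep_fraction` rule for outliers, every function name, and the toy data are assumptions made for illustration. Features are assumed to be already scaled to [0, 1].

```python
import math

def prune_synonyms(rows, threshold):
    """Assumed rule: keep a feature (column) only if its Euclidean distance
    to every feature kept so far exceeds `threshold`."""
    cols = list(zip(*rows))  # transpose so that features become vectors
    kept = []
    for j, col in enumerate(cols):
        if all(math.dist(col, cols[k]) > threshold for k in kept):
            kept.append(j)
    return kept

def prune_outliers(rows, keep_fraction):
    """Assumed rule: keep the fraction of instances (rows) that lie
    closest to their own nearest neighbor."""
    def nn_dist(i):
        return min(math.dist(rows[i], rows[j])
                   for j in range(len(rows)) if j != i)
    ranked = sorted(range(len(rows)), key=nn_dist)
    n_keep = max(1, round(keep_fraction * len(rows)))
    return sorted(ranked[:n_keep])

def one_nn_estimate(train_rows, train_efforts, query):
    """K = 1 nearest neighbor: return the effort of the closest training row."""
    nearest = min(range(len(train_rows)),
                  key=lambda i: math.dist(train_rows[i], query))
    return train_efforts[nearest]

def mre(actual, predicted):
    """Magnitude of relative error: |actual - predicted| / actual."""
    return abs(actual - predicted) / actual

def pred(actuals, predictions, level=25):
    """PRED(level): percentage of estimates whose MRE is at most level/100."""
    hits = sum(mre(a, p) <= level / 100 for a, p in zip(actuals, predictions))
    return 100.0 * hits / len(actuals)

# Toy data (hypothetical): 4 projects, 3 features, efforts in person-hours.
features = [
    [0.1, 0.1, 0.9],
    [0.2, 0.2, 0.8],
    [0.3, 0.3, 0.1],
    [0.9, 0.9, 0.5],
]
efforts = [100.0, 120.0, 300.0, 900.0]

kept_features = prune_synonyms(features, threshold=0.1)
reduced = [[row[j] for j in kept_features] for row in features]
kept_rows = prune_outliers(reduced, keep_fraction=0.75)

train_rows = [reduced[i] for i in kept_rows]
train_efforts = [efforts[i] for i in kept_rows]

# A held-out project, already projected onto the kept features.
query, actual_effort = [0.25, 0.15], 250.0
estimate = one_nn_estimate(train_rows, train_efforts, query)
print(kept_features, kept_rows, estimate, mre(actual_effort, estimate))
```

In this toy case the second feature duplicates the first and the fourth project is the most isolated, so both are pruned before the estimate is made. In a real hold-out experiment the threshold and fraction would need to be chosen on the training data only.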

Journal: IEEE Transactions on Software Engineering	Publication Date: Aug 1, 2013
Citations: 88

Similar Papers

A survey on software effort estimation
K Usharani ... D Velmurugan
01 Mar 2016

Toward Improving the Efficiency of Software Development Effort Estimation via Clustering Analysis
Vo Van Hai ... Petr Silhavy
IEEE Access | VOL. 10
01 Jan 2021

The adjusted analogy-based software effort estimation based on similarity distances
Nan-Hsing Chiu ... Sun-Jen Huang
Journal of Systems and Software | VOL. 80
21 Jul 2006

An Empirical Analysis on Software Development Efforts Estimation in Machine Learning Perspective
...
05 Oct 2021
