Learning from examples: generation and evaluation of decision trees for software resource analysis

R.W. Selby, A.A. Porter

doi:10.1109/32.9061

Abstract

A general solution method for the automatic generation of decision (or classification) trees is investigated. The approach is to provide insights through in-depth empirical characterization and evaluation of decision trees for one problem domain: software resource data analysis. The purpose of the decision trees is to identify classes of objects (software modules) that had high development effort, i.e., effort in the uppermost quartile relative to past data. Sixteen software systems ranging from 3,000 to 112,000 source lines were selected for analysis from a NASA production environment. The collection and analysis of 74 attributes (or metrics) for over 4,700 objects capture a multitude of information about the objects: development effort, faults, changes, design style, and implementation style. A total of 9,600 decision trees are automatically generated and evaluated. The analysis focuses on the characterization and evaluation of decision tree accuracy, complexity, and composition. Averaged across all 9,600 trees, the decision trees correctly identified 79.3% of the software modules that had high development effort or faults. The decision trees generated from the best parameter combinations correctly identified 88.4% of the modules on average. Visualization of the results is emphasized, and sample decision trees are included.
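The abstract does not spell out the tree-generation procedure, so the following is only a minimal sketch of the general idea, under stated assumptions: modules are labeled "high effort" when their effort falls in the uppermost quartile, and a single tree node is chosen by an entropy-based information-gain criterion (a common choice for decision-tree induction, assumed here rather than taken from the paper). The module data, the metric names `source_lines` and `changes`, and all function names are hypothetical.

```python
import math

# Hypothetical module records; the values are invented for illustration only.
modules = [
    {"effort": 5,  "source_lines": 40,  "changes": 1},
    {"effort": 8,  "source_lines": 55,  "changes": 0},
    {"effort": 12, "source_lines": 90,  "changes": 2},
    {"effort": 15, "source_lines": 70,  "changes": 1},
    {"effort": 20, "source_lines": 120, "changes": 3},
    {"effort": 22, "source_lines": 150, "changes": 2},
    {"effort": 60, "source_lines": 400, "changes": 9},
    {"effort": 75, "source_lines": 520, "changes": 7},
]

def top_quartile_threshold(values):
    """Nearest-rank 75th percentile: values above it form the top quartile."""
    ordered = sorted(values)
    rank = math.ceil(0.75 * len(ordered))  # 1-based rank
    return ordered[rank - 1]

def entropy(labels):
    """Binary entropy (in bits) of a list of True/False labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def best_split(rows, labels, attributes):
    """Return (attribute, cutoff, gain) for the split with the highest gain.

    Assumption: binary splits of the form `value <= cutoff`; ties keep the
    first candidate found.
    """
    base = entropy(labels)
    best = (None, None, 0.0)
    for attr in attributes:
        for cutoff in sorted({row[attr] for row in rows}):
            left = [l for row, l in zip(rows, labels) if row[attr] <= cutoff]
            right = [l for row, l in zip(rows, labels) if row[attr] > cutoff]
            if not left or not right:
                continue  # a split must separate the data into two groups
            remainder = (len(left) * entropy(left)
                         + len(right) * entropy(right)) / len(labels)
            gain = base - remainder
            if gain > best[2]:
                best = (attr, cutoff, gain)
    return best

threshold = top_quartile_threshold([m["effort"] for m in modules])
labels = [m["effort"] > threshold for m in modules]  # True = high effort
attr, cutoff, gain = best_split(modules, labels, ["source_lines", "changes"])
print(f"high effort if effort > {threshold}; root split: {attr} <= {cutoff}")
```

A full tree would apply `best_split` recursively to each resulting subset; the study itself varies generation parameters across thousands of trees, which this single-node sketch does not attempt to reproduce.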

Journal: IEEE Transactions on Software Engineering
Publication Date: Jan 1, 1988
Citations: 252

Similar Papers

Flexible Multidiscretizer Based on Measures which are Used in Induction of Decision Trees
Cezary Ko•Mider
01 Jan 2002

Metric-driven classification analysis
Richard W Selby ... R Kent Madsen
01 Jan 1991

An initial comparison on noise resisting between crisp and fuzzy decision trees
Juan Sun ... Xi-Zhao Wang
01 Jan 2004

Big Data with Decision Tree Induction
Shabnam Sabah ... Sara Zumerrah Binte Anwar
01 Aug 2019
