Metric-driven classification analysis

Richard W Selby, R Kent Madsen

doi:10.1007/3540547428_54

Abstract

Metric-driven classification models identify software components with user-specifiable properties, such as those likely to be fault-prone, have high development effort, or have faults in a certain class. These models are generated automatically from past metric data, and they are scalable to large systems and calibratable to different projects. These models serve as extensible integration frameworks for software metrics because they allow the addition of new metrics and integrate symbolic and numeric data from all four measurement abstractions. In our past work, we developed and evaluated techniques for generating tree-based classification models. In this paper, we investigate a technique for generating network-based classification models. The principle underlying the tree-based models is partitioning, while the principle underlying the network-based models is pattern matching. Tree-based models prune away information and can be decomposed, while network-based models retain all information and tend to be more complex. We evaluate the predictive accuracy of network-based models and compare them to the tree-based models.

The evaluative study uses metric data from 16 NASA production systems ranging in size from 3000 to 112,000 source lines. The goal of the classification models is to identify the software components in the systems that had “high” development faults or effort, where “high” is defined to be in the uppermost quartile relative to past data. The models are derived from 74 candidate metrics that capture a multiplicity of information about the components: development effort, faults, changes, design style, and implementation style. A total of 1920 tree- and network-based models are automatically generated, and their predictive accuracies are compared in terms of correctness, completeness, and consistency using a non-parametric analysis of variance model. On the average, the predictions from the network-based models had 89.6% correctness, 69.1% completeness, and 79.5% consistency, while those from the tree-based models had 82.2% correctness, 56.3% completeness, and 74.5% consistency. The network-based models had statistically higher correctness and completeness than did the tree-based models, but they were not different statistically in terms of consistency. Capabilities to generate metric-driven classification models will be supported in the Amadeus measurement-driven analysis and feedback system.

Keywords: Classification Model, Target Class, Software Metrics, Average Classification Accuracy, Training Project

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
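
The target-class definition and the three accuracy measures named in the abstract can be illustrated with a small sketch. It is not the authors' tree- or network-generation technique, and the abstract does not define correctness, completeness, or consistency. The definitions used here are assumptions chosen for illustration: correctness as the fraction of all components classified correctly, completeness as the fraction of actually "high" components that were identified, and consistency as the fraction of components predicted "high" that actually were. Treating "uppermost quartile" as strictly above the 75th percentile of past data is also an assumption, and all function names are hypothetical.

```python
from statistics import quantiles


def upper_quartile_threshold(past_values):
    """Return the 75th percentile (Q3) of past metric values."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    return quantiles(past_values, n=4)[2]


def label_high(values, threshold):
    """Mark a component as 'high' when its value exceeds the threshold.

    Using a strict comparison is an assumption of this sketch.
    """
    return [v > threshold for v in values]


def evaluate(actual, predicted):
    """Compute the three measures under the ASSUMED definitions above.

    actual, predicted: equal-length lists of booleans (True = 'high').
    A measure with an empty denominator is reported as 0.0.
    """
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a and p)
    true_neg = sum(1 for a, p in pairs if not a and not p)
    actual_high = sum(1 for a in actual if a)
    predicted_high = sum(1 for p in predicted if p)

    correctness = (true_pos + true_neg) / len(pairs) if pairs else 0.0
    completeness = true_pos / actual_high if actual_high else 0.0
    consistency = true_pos / predicted_high if predicted_high else 0.0
    return correctness, completeness, consistency


# Hypothetical fault counts from past projects, used only to set the threshold.
past_faults = [0, 1, 1, 2, 2, 3, 5, 8]
threshold = upper_quartile_threshold(past_faults)

# Hypothetical fault counts for new components and a model's predictions.
new_faults = [9, 6, 0, 1, 2, 7, 3, 0]
actual = label_high(new_faults, threshold)
predicted = [True, False, False, True, False, True, False, False]

correctness, completeness, consistency = evaluate(actual, predicted)
print(f"correctness={correctness:.2f} "
      f"completeness={completeness:.2f} "
      f"consistency={consistency:.2f}")
```

Deriving the threshold from past data only, rather than from the components being classified, mirrors the abstract's phrase "relative to past data"; a real study would also need the paper's exact measure definitions.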


