Abstract

Software quality monitoring and analysis are among the most productive topics in software engineering research, and engineers can apply their results effectively throughout the software development life cycle. Open source software constitutes a valid test case for the assessment of software characteristics. Data mining has been proposed in the literature as a way to extract software characteristics from software engineering data. This paper compares diverse data mining techniques (e.g., those derived from machine learning) for developing effective software quality prediction models. To achieve this goal, we tackled various issues, such as the collection of software metrics from open source repositories, the assessment of prediction models to detect software issues, and the adoption of statistical methods to evaluate data mining techniques. The results of this study aim to identify which of the data mining techniques used in this paper perform best for software quality prediction models.

Highlights

  • The software used in scientific environments is a rich mixture of in-house software and software taken from the large open source community [1]

  • This study aims at providing an initial comparative performance analysis of different data mining techniques for software quality prediction through a well-documented methodology

  • Some metrics of the suite are: Weighted Methods per Class, which measures the number of methods in each class; Depth of Inheritance Tree, which measures the length of the longest path from a class to the root of the inheritance tree; and Number of Children, which measures the number of classes that are direct descendants of each class
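
As an illustration of the three metrics named in the last highlight, the following minimal sketch computes them for a toy class hierarchy using Python introspection. It is not the tooling used in the study. The class names (`Shape`, `Polygon`, `Circle`, `Square`) and the helper functions `wmc`, `dit` and `noc` are hypothetical, and two assumptions are made: every method has unit weight, so Weighted Methods per Class reduces to a method count, and the root of the inheritance tree is Python's `object` at depth 0.

```python
import inspect


def wmc(cls):
    """Weighted Methods per Class, assuming unit weight per method.

    Counts only plain functions defined directly in the class body;
    inherited methods, static methods and class methods are ignored.
    """
    return sum(1 for member in vars(cls).values() if inspect.isfunction(member))


def dit(cls):
    """Depth of Inheritance Tree: longest path from cls up to `object`."""
    if cls is object:
        return 0  # assumption: the root itself has depth 0
    return 1 + max(dit(base) for base in cls.__bases__)


def noc(cls):
    """Number of Children: classes that directly inherit from cls."""
    return len(cls.__subclasses__())


# Hypothetical hierarchy used only to exercise the three functions.
class Shape:
    def area(self): ...
    def describe(self): ...


class Polygon(Shape):
    def sides(self): ...


class Circle(Shape):
    def area(self): ...


class Square(Polygon):
    def area(self): ...
    def sides(self): ...


for cls in (Shape, Polygon, Circle, Square):
    print(cls.__name__, "WMC:", wmc(cls), "DIT:", dit(cls), "NOC:", noc(cls))
```

Tools that analyse other languages may follow different conventions, for example weighting methods by complexity or counting depth from a different root, so values from this sketch are not directly comparable with published metric values.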

Summary

Introduction

The software used in scientific environments (e.g., software for high energy physics, HEP) is a rich mixture of in-house software and software taken from the large open source community [1]. Software engineering data constitute the input of one or more data mining techniques (such as Random Forest, Bagging, and Support Vector Machine). The output of these techniques helps software engineers to mine patterns and to detect violations of those patterns, which are likely to be defects. In this way, data are converted into knowledge that supports the most common software engineering tasks: programming, defect detection, testing, and maintenance [5]. Many studies deal with software quality prediction and data mining techniques. This study aims to provide an initial comparative performance analysis of different data mining techniques for software quality prediction through a well-documented methodology.
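
To give a flavour of how one of the techniques named above turns metric data into a defect prediction, the sketch below implements Bagging in its simplest form: several weak models are trained on bootstrap resamples of the data and their votes are combined by majority. It is an illustrative assumption and not the study's implementation. The single input feature (a method count per class), the toy training data, and the function names `train_stump`, `train_bagging` and `predict` are all hypothetical, and each weak model is a one-threshold rule rather than a full decision tree.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a threshold t so that 'value > t' predicts label 1 (defect-prone).

    Picks the candidate threshold with the fewest training errors.
    """
    candidates = [float("-inf")] + sorted({value for value, _ in samples})

    def errors(t):
        return sum((1 if value > t else 0) != label for value, label in samples)

    return min(candidates, key=errors)


def train_bagging(samples, n_models=25, seed=0):
    """Train n_models stumps, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(samples) items with replacement.
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(train_stump(bootstrap))
    return models


def predict(models, value):
    """Majority vote over all stumps (n_models is odd, so no ties)."""
    votes = Counter(1 if value > t else 0 for t in models)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): (method count, label), label 1 = defect-prone.
training = [(n, 0) for n in range(2, 12)] + [(n, 1) for n in range(21, 31)]
models = train_bagging(training)

print("class with 1 method   ->", predict(models, 1))
print("class with 40 methods ->", predict(models, 40))
```

Real studies typically use many metrics per class and established library implementations; the point here is only the resample-train-vote structure that distinguishes Bagging from a single model.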

Research Methodology
Study Setup
Initial Assessment
Conclusion
A Glossary
B Datasets