Data Mining Techniques for Software Quality Prediction in Open Source Software

M Litmaath, A Forti, O Smirnova, L Betev, P Hristov, Marco Canaparo, Elisabetta Ronchieri

doi:10.1051/epjconf/201921405007


Open Access

https://doi.org/10.1051/epjconf/201921405007


Journal: EPJ Web of Conferences
Publication Date: Jan 1, 2019
Citations: 3
License type: CC BY 4.0

Affiliation: INFN Sezione di Bologna

Abstract

Software quality monitoring and analysis are among the most productive topics in software engineering research. Their results can be employed effectively by engineers throughout the software development life cycle. Open source software constitutes a valid test case for the assessment of software characteristics. The data mining approach has been proposed in the literature to extract software characteristics from software engineering data. This paper compares diverse data mining techniques (e.g., those derived from machine learning) for developing effective software quality prediction models. To achieve this goal, we tackled various issues, such as the collection of software metrics from open source repositories, the assessment of prediction models to detect software issues, and the adoption of statistical methods to evaluate data mining techniques. The results of this study aim to identify which of the data mining techniques used in this paper perform best for software quality prediction models.

Highlights

The software used in scientific environments is a rich mixture of in-house software and software taken from the large open source community [1]
This study aims to provide an initial comparative performance analysis of different data mining techniques for software quality prediction through a well-documented methodology
Some metrics of the suite are: Weighted Methods per Class, which measures the number of methods in each class; Depth of Inheritance Tree, which measures the length of the longest path from a class to the root of the inheritance tree; Number of Children, which measures the number of classes that are direct descendants of each class
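
To make the three metric definitions above concrete, here is a minimal sketch that computes them for ordinary Python classes. It is an illustration only, not the tooling used in the paper: the helper names (`wmc`, `dit`, `noc`) and the small `Shape` hierarchy are hypothetical, every method is given weight 1, and Python's built-in `object` is treated as the root of the inheritance tree.

```python
import inspect
import math


def wmc(cls):
    """Weighted Methods per Class with unit weights (an assumption):
    counts the plain functions defined directly in the class body."""
    return sum(1 for member in vars(cls).values() if inspect.isfunction(member))


def dit(cls):
    """Depth of Inheritance Tree: length of the longest path from cls
    up to the root, taken here to be Python's built-in `object`."""
    if cls is object:
        return 0
    return 1 + max(dit(base) for base in cls.__bases__)


def noc(cls):
    """Number of Children: number of direct subclasses of cls."""
    return len(cls.__subclasses__())


# Hypothetical class hierarchy used only to exercise the helpers.
class Shape:
    def area(self):
        raise NotImplementedError

    def describe(self):
        return f"{type(self).__name__} with area {self.area():.2f}"


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


class UnitSquare(Square):
    def __init__(self):
        super().__init__(1.0)


for cls in (Shape, Circle, Square, UnitSquare):
    print(f"{cls.__name__}: WMC={wmc(cls)} DIT={dit(cls)} NOC={noc(cls)}")
```

In this sketch `Shape` has two children (`Circle` and `Square`), and `UnitSquare` sits at depth 3 because the path to the root passes through `Square`, `Shape` and `object`. Tools that analyse other languages may count the root differently, so absolute depth values are not necessarily comparable across tools.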

Summary

Introduction

The software used in scientific environments (e.g. the HEP software) is a rich mixture of in-house software and software taken from the large open source community [1]. The aforementioned data constitute the input of one or more data mining techniques (such as Random Forest, Bagging and Support Vector Machine). The output of these techniques helps software engineers to mine patterns and detect violations of those patterns, which are likely to be defects. Data are thus converted into knowledge that can help in conducting the most common software engineering tasks: programming, defect detection, testing and maintenance [5]. Many studies deal with software quality prediction and data mining techniques. This study aims to provide an initial comparative performance analysis of different data mining techniques for software quality prediction through a well-documented methodology.
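
The kind of comparison described above can be sketched in a few lines of plain Python. This is not the paper's experimental setup: the data are synthetic, the labelling rule is invented for the example, and only Bagging (one of the techniques named above) is implemented, using one-split decision stumps as the base learner and comparing it against a single stump by k-fold cross-validation. All function and variable names are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels):
    """Pick the single (feature, threshold) split with the fewest training errors."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            for above in (0, 1):  # label predicted when the value exceeds t
                errors = sum((above if r[f] > t else 1 - above) != y
                             for r, y in zip(rows, labels))
                if best is None or errors < best[0]:
                    best = (errors, f, t, above)
    _, f, t, above = best
    return lambda r: above if r[f] > t else 1 - above


def fit_bagging(rows, labels, n_models=15, seed=0):
    """Bagging: fit stumps on bootstrap resamples, predict by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        models.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return lambda r: Counter(m(r) for m in models).most_common(1)[0][0]


def cross_val_accuracy(fit, rows, labels, k=5):
    """Mean accuracy over k folds; fold i holds out samples i, i+k, i+2k, ..."""
    scores = []
    for i in range(k):
        held_out = set(range(i, len(rows), k))
        train = [j for j in range(len(rows)) if j not in held_out]
        model = fit([rows[j] for j in train], [labels[j] for j in train])
        hits = sum(model(rows[j]) == labels[j] for j in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / k


# Synthetic stand-in for a metrics dataset: (methods, depth, children) per class.
# The rule deciding which classes are "defective" (label 1) is invented.
data_rng = random.Random(42)
rows, labels = [], []
for _ in range(100):
    n_methods = data_rng.randint(1, 40)
    depth = data_rng.randint(1, 6)
    children = data_rng.randint(0, 5)
    rows.append((n_methods, depth, children))
    labels.append(1 if n_methods + 4 * depth + data_rng.gauss(0, 6) > 35 else 0)

results = {
    "single stump": cross_val_accuracy(fit_stump, rows, labels),
    "bagged stumps": cross_val_accuracy(fit_bagging, rows, labels),
}
for name, accuracy in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: mean accuracy {accuracy:.2f}")
```

With real data the rows would come from metrics collected from a repository and the labels from its defect records. A fuller comparison would add further techniques and a statistical test on the per-fold scores rather than ranking on mean accuracy alone.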

Research Methodology

Study Setup

Initial Assessment

Conclusion

A Glossary

B Datasets


