Abstract

Software maintainability is a key quality attribute that determines the success of a software product, and predicting it accurately can help to improve overall software quality. This paper applies data mining to some new predictor metrics, in addition to traditionally used software metrics, to predict the maintainability of software systems. The prediction models are constructed using static code metric datasets from four open source software (OSS) systems: Lucene, JHotdraw, JEdit, and JTreeview. Lucene contains 385 classes and 135241 lines of code (LOC), JHotdraw contains 159 classes and 21802 LOC, JEdit contains 275 classes and 104053 LOC, and JTreeview contains 60 classes and 11988 LOC. The metrics were collected using two metrics extraction tools: the Chidamber and Kemerer Java Metrics (CKJM) tool and IntelliJ IDEA. Naive Bayes, Bayes Network, Logistic, MultiLayerPerceptron, and Random Forest classifiers are used to identify the software modules that are difficult to maintain. Random forest models are found to be the most useful for software maintainability prediction by data mining of software code metrics, as they have higher recall, precision, and area under the ROC curve (AUC).
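The three evaluation measures named above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, where label 1 stands for a hypothetical "difficult to maintain" class and each score is a classifier's estimated probability of that label.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 8 classes, 4 of them labelled hard to maintain.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

precision, recall = precision_recall(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"precision={precision:.3f} recall={recall:.3f} auc={auc:.3f}")
```

Precision and recall depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the three are usually reported together when comparing classifiers.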

