Bug Prediction Model using Code Smells

Gihan M Ubayawardana, Damith D Karunaratna

doi:10.1109/icter.2018.8615550

Abstract

The term ‘Code Smells’ was first coined in the book Refactoring: Improving the Design of Existing Code by M. Fowler in 1999. Code smells are poor design choices that have the potential to cause an error or failure in a computer program. The objective of this study is to evaluate code smells as a candidate metric for building a bug prediction model. We built a bug prediction model using both source code metrics and code smell based metrics proposed in the literature, with Naive Bayes, Random Forest and Logistic Regression as the candidate algorithms. We trained the model on multiple versions of 13 different Java-based open source projects. The trained model was used to predict bugs within a particular version of a project, within a particular project and across different projects. We demonstrate that code smell based metrics can significantly improve the accuracy of a bug prediction model when integrated with source code metrics. The Random Forest based model showed higher accuracy than the other algorithms within a version, within a project and across projects.
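The three prediction settings named in the abstract (within a version, within a project, and among projects) can be illustrated with a minimal sketch of how training and test data might be partitioned. The abstract does not describe the exact splitting procedure, so everything below is an assumption made for illustration: the `Sample` record, the function names, the hold-out rule, and the toy data are all hypothetical and are not taken from the study.

```python
from collections import namedtuple

# Hypothetical record: one class/file from one release, with a feature
# vector (source code metrics plus smell-based metrics) and a bug label.
Sample = namedtuple("Sample", "project version features buggy")


def within_version_split(samples, project, version, test_every=5):
    """Hold out every `test_every`-th sample of a single release (assumed rule)."""
    pool = [s for s in samples if s.project == project and s.version == version]
    train = [s for i, s in enumerate(pool) if i % test_every != 0]
    test = [s for i, s in enumerate(pool) if i % test_every == 0]
    return train, test


def within_project_split(samples, project, test_version):
    """Train on the other releases of one project, test on the chosen release."""
    pool = [s for s in samples if s.project == project]
    train = [s for s in pool if s.version != test_version]
    test = [s for s in pool if s.version == test_version]
    return train, test


def cross_project_split(samples, test_project):
    """Train on all other projects, test on the chosen project."""
    train = [s for s in samples if s.project != test_project]
    test = [s for s in samples if s.project == test_project]
    return train, test


# Toy data only: two made-up projects, two releases each, ten samples per release.
samples = [
    Sample(p, v, [float(i), float(i % 3)], i % 2 == 0)
    for p in ("projA", "projB")
    for v in ("1.0", "1.1")
    for i in range(10)
]

train, test = cross_project_split(samples, "projB")
print(len(train), len(test))
```

Any classifier (for example Naive Bayes, Random Forest or Logistic Regression) could then be fitted on `train` and scored on `test`. Keeping the split logic separate from the learner makes it easy to run the same algorithm under all three settings.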

Full Text