Abstract
Software Defect Prediction (SDP) models identify the defect-prone artifacts in a project, helping testing experts and improving resource utilization. Building an SDP model is straightforward for a project that has historical data available. The problem arises when an SDP model is needed for a project that has limited or no historical data. In this paper, we investigate whether data from Open-Source Software (OSS) projects is helpful in building an SDP model for proprietary software with limited or no historical data. To collect training data for the SDP models, we developed a tool that extracts metric data from the source code of open-source software projects. Three benchmark datasets from NASA projects serve as the proprietary software datasets for which software defects are predicted. Six machine learning algorithms are used to build the SDP models: Logistic Regression (LR), k-Nearest Neighbors (kNN), Naive Bayes, Neural Network, Support Vector Machine (SVM), and Random Forest. The performance of these six SDP models is compared using popular performance indicators such as precision, recall, and F-measure. The study concludes that SDP models trained on data from OSS projects predict software defects for proprietary software with greater accuracy than SDP models trained on proprietary software data achieve when predicting defects for OSS projects.
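As a rough illustration of the performance indicators named above, the following minimal sketch computes precision, recall, and F-measure for the defective class from actual and predicted labels. The function name `precision_recall_f1` and the eight example labels are hypothetical and are not taken from the study; label 1 is assumed to mean "defective".

```python
def precision_recall_f1(actual, predicted):
    """Return (precision, recall, F-measure) for the defective class (label 1)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # defective, flagged
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # clean, flagged
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # defective, missed

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical labels for eight modules (1 = defective, 0 = clean).
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall, f1 = precision_recall_f1(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} F-measure={f1:.2f}")
```

In a cross-project setting, `predicted` would come from a model trained on one project's data and applied to another project's modules, so the same three numbers can be compared across classifiers and across training directions.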