REPD: Source code defect prediction as anomaly detection

Petar Afric, Lucija Sikic, Adrian Satja Kurdija, Marin Silic

doi:10.1016/j.jss.2020.110641

Abstract

In this paper, we present a novel approach to within-project source code defect prediction. Because defect prediction datasets are typically imbalanced and contain few defective examples, we treat defect prediction as anomaly detection. We present our Reconstruction Error Probability Distribution (REPD) model, which can handle point and collective anomalies. We compare it on five traditional code feature datasets against five models: Gaussian Naive Bayes, logistic regression, k-nearest-neighbors, decision tree, and Hybrid SMOTE-Ensemble. In addition, REPD is compared against the same models on 24 semantic feature datasets. To compare the performance of the competing models, we use the F1-score. Using statistical analysis, we show that our model produces significantly better results, improving the F1-score by up to 7.12%. Additionally, we analyze REPD's robustness to dataset imbalance by creating defect-undersampled and non-defect-oversampled datasets.
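
The abstract does not describe the model's internals, so the following is only a rough sketch of one plausible reading of the name "Reconstruction Error Probability Distribution", not the authors' implementation. It assumes a PCA reconstructor fitted on non-defective examples and a Gaussian fitted to each class's reconstruction errors; both choices, the toy data, and every function name (`fit_repd_sketch`, `make_toy_data`, and so on) are illustrative assumptions. The F1-score is computed from its standard definition, F1 = 2PR / (P + R).

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the mean and the top principal directions of X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def reconstruction_error(X, mean, components):
    """Euclidean distance between each row and its PCA reconstruction."""
    centered = X - mean
    reconstructed = centered @ components.T @ components
    return np.linalg.norm(centered - reconstructed, axis=1)

def gaussian_logpdf(x, mu, sigma):
    """Log density of a normal distribution with mean mu and std sigma."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)

def fit_repd_sketch(X, y, n_components=1):
    """Assumed scheme: reconstructor fitted on non-defective rows (y == 0),
    then one Gaussian per class fitted to the reconstruction errors."""
    mean, comps = fit_pca(X[y == 0], n_components)
    errs = reconstruction_error(X, mean, comps)
    params = {c: (errs[y == c].mean(), errs[y == c].std() + 1e-9) for c in (0, 1)}
    return mean, comps, params

def predict(X, model):
    """Label a row defective (1) if its error is more likely under class 1."""
    mean, comps, params = model
    errs = reconstruction_error(X, mean, comps)
    ll_clean = gaussian_logpdf(errs, *params[0])
    ll_defective = gaussian_logpdf(errs, *params[1])
    return (ll_defective > ll_clean).astype(int)

def f1_score(y_true, y_pred):
    """F1 = 2PR / (P + R) for the positive (defective) class."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return float(2 * precision * recall / (precision + recall))

def make_toy_data(seed=0, n_clean=200, n_defective=20):
    """Synthetic, imbalanced 2-D data (hypothetical, not from the paper):
    clean rows lie near a line, defective rows are shifted off it."""
    rng = np.random.default_rng(seed)
    direction = np.array([1.0, 2.0])
    normal = np.array([2.0, -1.0]) / np.sqrt(5.0)
    clean = (rng.normal(size=(n_clean, 1)) * direction
             + rng.normal(scale=0.05, size=(n_clean, 2)))
    defective = (rng.normal(size=(n_defective, 1)) * direction + 3.0 * normal
                 + rng.normal(scale=0.05, size=(n_defective, 2)))
    X = np.vstack([clean, defective])
    y = np.concatenate([np.zeros(n_clean, dtype=int),
                        np.ones(n_defective, dtype=int)])
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    model = fit_repd_sketch(X, y)
    print("F1 on toy training data:", f1_score(y, predict(X, model)))
```

The toy evaluation scores the model on its own training data and is meant only to show the mechanics: because the reconstructor is fitted on the majority (non-defective) class alone, the scarcity of defective examples affects only the small error distribution fitted for that class.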

Journal: The Journal of Systems & Software
Publication Date: May 15, 2020
Citations: 9

Similar Papers

REPD: Source Code Defect Prediction As Anomaly Detection
Petar Afric ... Marin Silic
01 Jul 2019

Evaluating defect prediction approaches using a massive set of metrics
Xiao Xuan ... David Lo
13 Apr 2015

A study of subgroup discovery approaches for defect prediction
Daniel Rodriguez ... Rachel Harrison
Information and Software Technology | VOL. 55
16 May 2013

A feature matching and transfer approach for cross-company defect prediction
Qiao Yu ... Yanmei Zhang
Journal of Systems and Software | VOL. 132
24 Jun 2017
