Abstract

An anti-pattern is a common response to a recurring problem that is usually ineffective and risks being highly counterproductive. In this work, we empirically investigate the association between the occurrence of four types of anti-patterns and source code metrics. Because the dataset is imbalanced, the Synthetic Minority Over-sampling Technique (SMOTE) is used for data sampling. Principal component analysis and rough set analysis are applied for feature extraction and selection. The features selected by these two techniques, along with the significant features (SIGF), are used as input for building predictive models that detect anti-patterns. The effectiveness of these techniques is evaluated using Logistic Regression (LOGR), Decision Tree (DT), and Least Squares Support Vector Machine (LSSVM) with three different kernels: linear (LSSVML), polynomial (LSSVMP), and radial basis function (LSSVMR). Experimental results reveal that the models developed using SMOTE yield better results than the models developed with the original dataset. Furthermore, we observe that the predictive models developed using LSSVM with linear and polynomial kernels are more effective than the models developed using the other classifiers.
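
To illustrate the sampling step, the following is a minimal sketch of SMOTE-style oversampling in plain Python. It is not the authors' implementation: the function names `smote` and `balance`, the neighbour count `k`, and the toy two-metric feature vectors are all assumptions made for illustration. Each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    minority: list of equal-length numeric feature vectors (the rare class).
    All names and defaults here are illustrative assumptions.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find the k nearest minority neighbours of the base sample,
        # excluding the base sample itself.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(majority, minority, k=5, seed=0):
    """Oversample the minority class until both classes have equal size."""
    n_new = max(0, len(majority) - len(minority))
    return minority + smote(minority, n_new, k=k, seed=seed)


if __name__ == "__main__":
    # Hypothetical data: two source code metrics per class.
    with_anti_pattern = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    without_anti_pattern = [[5.0, 5.0]] * 10
    balanced = balance(without_anti_pattern, with_anti_pattern, k=2, seed=1)
    print(len(balanced), "minority samples after oversampling")
```

In a full pipeline, the balanced data would then pass through feature extraction or selection before a classifier is trained. Resampling is normally applied to the training split only, so that evaluation still reflects the original class distribution.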
