A study of dealing class imbalance problem with machine learning methods for code smell severity detection using PCA-based feature selection technique

Rajwant Singh Rao,Seema Dewangan,Alok Mishra,Manjari Gupta

doi:10.1038/s41598-023-43380-8

Abstract

Detecting code smells can help reduce maintenance costs and raise source code quality. Code smells help developers and researchers understand several types of design flaws. High-severity code smells can cause significant problems for the software and make the system harder to maintain. Assessing the severity of the code smells detected in software is essential because it helps prioritize refactoring efforts. The class imbalance problem makes code smell severity detection more difficult. In this study, four code smell severity datasets (Data class, God class, Feature envy, and Long method) are selected to detect code smell severity. To address class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) is applied. Each dataset's relevant features are chosen using a feature selection technique based on principal component analysis (PCA). The severity of code smells is determined using five machine learning techniques: K-nearest neighbor, Random forest, Decision tree, Multi-layer Perceptron, and Logistic Regression. The study obtained a severity accuracy score of 0.99 with the Random forest and Decision tree approaches on the Long method code smell. Model performance is compared using accuracy and three other measures (Precision, Recall, and F-measure). Performance with and without SMOTE is also compared and presented. The results are promising and may pave the way for further studies in this area.
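
The abstract names SMOTE as the class balancing step but does not show how it works. The following is a minimal sketch of the core idea, not the authors' implementation: each synthetic sample is placed on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. The function name `smote`, its parameters, and the toy data are assumptions made for illustration; none of the values come from the paper's datasets. It uses only the Python standard library so it can be run as-is.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between minority samples.

    Illustrative sketch of the SMOTE idea: pick a minority sample, pick one of
    its k nearest minority neighbours, and emit a point on the segment between
    the two.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, nearest first (Euclidean distance).
        others = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: each row is a small feature vector, not a value
# taken from the paper's code smell datasets.
majority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3], [0.9, 0.7], [1.3, 1.0]]
minority = [[4.0, 4.0], [4.5, 4.2], [4.2, 3.8]]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

In practice a maintained implementation such as `SMOTE` from the imbalanced-learn package would normally be used instead of a hand-written version. Oversampling is usually applied to the training split only, so that synthetic samples do not leak into evaluation.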

Journal: Scientific Reports	Publication Date: Sep 27, 2023
Citations: 10	License type: CC BY 4.0

Similar Papers

An Exploratory Evaluation of Continuous Feedback to Enhance Machine Learning Code Smell Detection
Daniel Cruz ... Eduardo Figueiredo
06 May 2024

Optimizing LSTM for Code Smell Detection: The Role of Data Balancing
Alnor Adam Khleel Nasraldeen ... Károly Nehéz
Infocommunications journal | VOL. 16
01 Jan 2024

Improving accuracy of code smells detection using machine learning with data balancing techniques
Nasraldeen Alnor Adam Khleel ... Károly Nehéz
The Journal of Supercomputing | VOL. 80
05 Jun 2024

Bad Smell Detection Using Machine Learning Techniques: A Systematic Literature Review
Ahmed Al-Shaaby ... Mohammad Alshayeb
Arabian Journal for Science and Engineering | VOL. 45
07 Jan 2020
