Smell Detection Research Articles

Summary: Machine learning‐based code smell detection (CSD) has been shown to be a valuable approach for improving software quality and helping developers identify problematic patterns in code. However, previous research has shown that the code smell data sets commonly used to train these models are heavily imbalanced. While some recent studies have explored imbalanced learning techniques for CSD, they evaluated only a limited number of techniques, so their conclusions about the most effective methods may be biased and inconclusive. To thoroughly evaluate the effect of imbalanced learning techniques on machine learning‐based CSD, we examine 31 imbalanced learning techniques with seven classifiers to build CSD models on four code smell data sets. We employ four evaluation metrics to assess detection performance, together with the Wilcoxon signed‐rank test and Cliff's δ. The results show that (1) not all imbalanced learning techniques significantly improve detection performance, but deep forest significantly outperforms the other techniques on all code smell data sets; (2) SMOTE (Synthetic Minority Over‐sampling TEchnique) is not the most effective technique for resampling code smell data sets; and (3) the best‐performing imbalanced learning techniques and the top‐3 data resampling techniques add little time cost to code smell detection. Therefore, we provide some practical guidelines. First, researchers and practitioners should select appropriate imbalanced learning techniques (e.g., deep forest) to ameliorate the class imbalance problem. By contrast, the blind application of imbalanced learning techniques could be harmful. Second, data resampling techniques better than SMOTE should be selected to preprocess the code smell data sets.
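
The abstract above reports effect sizes with Cliff's δ alongside the Wilcoxon signed‐rank test. As a minimal sketch of what that effect size measures, the following computes δ as the proportion of pairs where one sample is larger minus the proportion where it is smaller. The function names, the sample scores, and the magnitude thresholds (0.147, 0.33, 0.474, a commonly used convention) are assumptions for illustration and are not taken from the paper.

```python
from itertools import product


def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all (x, y) pairs."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = 0
    less = 0
    for x, y in product(xs, ys):
        if x > y:
            greater += 1
        elif x < y:
            less += 1
    return (greater - less) / (len(xs) * len(ys))


def magnitude(delta):
    """Map |delta| to a label using commonly cited thresholds (an assumption here)."""
    size = abs(delta)
    if size < 0.147:
        return "negligible"
    if size < 0.33:
        return "small"
    if size < 0.474:
        return "medium"
    return "large"


if __name__ == "__main__":
    # Hypothetical scores for two techniques; made-up values, not results from the study.
    technique_a = [0.71, 0.74, 0.69, 0.77]
    technique_b = [0.62, 0.66, 0.70, 0.64]
    delta = cliffs_delta(technique_a, technique_b)
    print(delta, magnitude(delta))
```

A δ near 0 means the two score distributions overlap almost completely, while a δ near +1 or -1 means one technique scores higher in nearly every pairwise comparison.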

Many software metrics are designed to measure aspects that are believed to be related to software quality. Static software metrics, e.g., size, complexity, and coupling, are used in defect prediction research as well as in software quality models to evaluate software quality. Static analysis tools also include boundary values for complexity and size that generate warnings for developers. While this indicates a relationship between quality and software metrics, the extent of that relationship is not well understood. Moreover, recent studies found that complexity metrics may be unreliable indicators of the understandability of source code. To explore this relationship, we leverage the intent of developers about what constitutes a quality improvement in their own code base. We manually classify a randomized sample of 2,533 commits from 54 Java open source projects as quality improving or not, based on the developer's intent as expressed in the commit message. We distinguish between perfective and corrective maintenance via predefined guidelines and use this data as ground truth for fine-tuning a state-of-the-art deep learning model for natural language processing. The benchmark we provide with our ground truth indicates that the deep learning model can be confidently used for commit intent classification. We use the model to increase our data set to 125,482 commits. Based on the resulting data set, we investigate the differences in size and in 14 static source code metrics between changes that increase quality, as indicated by the developer, and changes unrelated to quality. In addition, we investigate which files are targets of quality improvements. We find that quality-improving commits are smaller than non-quality-improving commits. Perfective changes have a positive impact on static source code metrics, while corrective changes tend to add complexity. Furthermore, we find that files that are the target of perfective maintenance already have a lower median complexity than files that are the target of non-perfective changes. Our study results provide empirical evidence for which static source code metrics capture quality improvement from the developers' point of view. This has implications for program understanding as well as for code smell detection and recommender systems.

Articles published on Smell Detection

A Novel Transfer Learning Method for Code Smell Detection on Heterogeneous Data: A Feasibility Study

Dynamic detection of accessibility smells

Security‐based code smell definition, detection, and impact quantification in Android

RETRACTED ARTICLE: Developing algorithmic business resource optimization model for code smells detection: an applied case insight from enterprise level software management system

Integrating Interactive Detection of Code Smells into Scrum: Feasibility, Benefits, and Challenges

A systematic literature review on source code similarity measurement and clone detection: Techniques, applications, and challenges

KubeHound: Detecting Microservices’ Security Smells in Kubernetes Deployments

On the relative value of imbalanced learning for code smell detection

Python code smells detection using conventional machine learning models.

Deep learning approaches for bad smell detection: a systematic literature review

Intelligent Mining of Association Rules Based on Nanopatterns for Code Smells Detection

Resource Allocation Modeling Framework to Refactor Software Design Smells

Code quality improvement using Aquila Optimizer

What really changes when developers intend to improve their source code: a commit-level study of static metric value and static analysis warning changes

On the Assessment of Interactive Detection of Code Smells in Practice: A Controlled Experiment

Ablation of AQP5 gene in mice leads to olfactory dysfunction caused by hyposecretion of Bowman's gland.

Static Code Analysis: A Tree of Science Review

Smell Detection Agent Optimization Approach to Path Generation in Automated Software Testing

Metric-based rule optimizing system for code smell detection using Salp Swarm and Cockroach Swarm algorithm

Hybrid Model with Multi-Level Code Representation for Multi-Label Code Smell Detection (077)
