Performance Of Machine Learning Classifiers Research Articles

Background: Continuous modifications, suboptimal software design practices, and stringent project deadlines contribute to the proliferation of code smells. Detecting and refactoring these code smells is pivotal to maintaining complex and essential software systems. Neglecting them may lead to future software defects, making systems hard to maintain and eventually obsolete. Supervised machine learning techniques have emerged as valuable tools for classifying code smells without needing expert knowledge or fixed threshold values. Classifier performance can be improved further through effective feature selection and hyperparameter optimization. Aim: To improve the performance of multiple machine learning classifiers by fine-tuning their hyperparameters with various types of meta-heuristic algorithms, including swarm-intelligence, physics-based, math-based, and bio-inspired methods. The resulting performance measures are compared to find the best meta-heuristic algorithm for code smell detection, and its impact is evaluated with statistical tests. Method: This study employs sixteen contemporary and robust meta-heuristic algorithms to optimize the hyperparameters of two machine learning algorithms: Support Vector Machine (SVM) and k-Nearest Neighbors (k-NN). The No Free Lunch theorem implies that the success of an optimization algorithm in one application may not extend to others. Consequently, a rigorous comparative analysis of these algorithms is undertaken to identify the best fit for code smell detection. The implemented algorithms are Arithmetic, Jellyfish Search, Flow Direction, Student Psychology Based, Pathfinder, Sine Cosine, Jaya, Crow Search, Dragonfly, Krill Herd, Multi-Verse, Symbiotic Organisms Search, Flower Pollination, Teaching Learning Based, Gravitational Search, and Biogeography-Based Optimization.
Results: For the optimized SVM, the highest accuracy, AUC, and F-measure values are 98.75%, 100%, and 98.57%, respectively, with increases in accuracy and AUC of up to 32.22% and 45.11%, respectively. For k-NN, the best accuracy, AUC, and F-measure values are all 100%, with increases in accuracy and ROC-AUC of up to 43.89% and 40.83%, respectively. Conclusion: Optimized SVM performs best with the Sine Cosine Optimization algorithm, while k-NN attains its peak performance with the Flower Optimization algorithm. Statistical analysis shows that optimizing machine learning classifiers with meta-heuristic algorithms significantly enhances their performance. Optimized SVM excels in detecting the God Class smell, while optimized k-NN is particularly effective in identifying the Data Class smell. Combining the two techniques automates the tuning process and raises classifier performance, addressing several longstanding challenges at once.
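
To illustrate the kind of search the Method describes, here is a minimal sketch of a basic Sine Cosine Algorithm written in plain Python. It is not the study's implementation: the function name `sine_cosine_search`, its parameters and defaults, and the toy objective `toy_error` are all assumptions made for illustration. In a real tuning run the objective would presumably be a cross-validated error of the SVM or k-NN classifier, evaluated at the candidate hyperparameter values.

```python
import math
import random


def sine_cosine_search(objective, bounds, n_agents=20, n_iter=100, a=2.0, seed=0):
    """Minimise `objective` inside the box `bounds` with a basic Sine Cosine Algorithm.

    Returns (best_position, best_value, history), where history[i] is the
    best value found after i iterations (history[0] is the initial population).
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the search box.
    agents = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    scores = [objective(x) for x in agents]
    best_i = min(range(n_agents), key=scores.__getitem__)
    best_pos, best_val = list(agents[best_i]), scores[best_i]
    history = [best_val]

    for t in range(n_iter):
        r1 = a - t * (a / n_iter)   # step size shrinks linearly from a towards 0
        dest = list(best_pos)       # destination stays fixed within one iteration
        for x in agents:
            for d in range(dim):
                r2 = rng.uniform(0.0, 2.0 * math.pi)
                r3 = rng.uniform(0.0, 2.0)
                # Choose the sine or the cosine update with equal probability.
                wave = math.sin(r2) if rng.random() < 0.5 else math.cos(r2)
                x[d] += r1 * wave * abs(r3 * dest[d] - x[d])
                lo, hi = bounds[d]
                x[d] = min(max(x[d], lo), hi)  # keep the agent inside the box
            val = objective(x)
            if val < best_val:
                best_pos, best_val = list(x), val
        history.append(best_val)
    return best_pos, best_val, history


def toy_error(params):
    """Hypothetical stand-in for a validation error; not taken from the study."""
    return (params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2


best_pos, best_val, history = sine_cosine_search(toy_error, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_pos, best_val)
```

Each coordinate of `bounds` would correspond to one hyperparameter (for example a regularisation strength or a neighbour count, rounded where an integer is needed); the mapping from search position to classifier settings is left out here because the abstract does not specify it.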

In recent years, several catastrophic landslide events have been observed around the globe, threatening lives and infrastructure. Landslide susceptibility maps are important for minimizing the impact of landslides. This study aims to extract high-quality non-landslide samples and improve the accuracy of landslide susceptibility modelling (LSM) by coupling ensemble learning with machine learning (ML). The study area is the Zigui-Badong section of the Three Gorges Reservoir area (TGRA) in China. Twelve influencing factors were selected as inputs for LSM, and the relationship between each causal factor and landslide spatial development was quantitatively analyzed. A total of 179 landslides were used. About 70% of the landslide pixels were randomly selected for training, and the remaining 30% were used for validation. A Logistic Regression (LR) model was applied to produce an initial susceptibility map, and the non-landslide samples were selected within the zone classified as low susceptibility. Subsequently, two ML classifiers (the Classification and Regression Tree (CART) and the Multi-Layer Perceptron (MLP)) and four coupled models (CART-Bagging, CART-Boosting, MLP-Bagging, and MLP-Boosting) were used for LSM. Finally, the receiver operating characteristic (ROC) curve and statistical analysis were applied for accuracy assessment. The results show that altitude and distance to rivers were the main causal factors of landslides in the study area. LR-MLP-Boosting performed best with an accuracy of 0.986, followed by LR-CART-Bagging, LR-CART-Boosting, and LR-MLP-Bagging. Accuracy comparisons demonstrate that ensemble learning can notably enhance the LSM performance of ML classifiers, and that Boosting marginally outperforms Bagging. Moreover, the LR model can effectively constrain the selection range of non-landslide samples. The non-landslide sampling method constrained by LR yields higher-quality samples than the traditional unconstrained random sampling method, which leads to better LSM results.
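
The constrained sampling step described above can be sketched as follows. This is only an illustration under stated assumptions: the susceptibility scores are taken as already produced by some initial model (LR in the study), and the function name `sample_non_landslide`, the `low_threshold` cut-off of 0.2, and the toy data are hypothetical; the abstract does not say how the low-susceptibility zone was delimited.

```python
import random


def sample_non_landslide(susceptibility, landslide_cells, n_samples,
                         low_threshold=0.2, seed=0):
    """Draw non-landslide cells only from the low-susceptibility zone.

    susceptibility  : one score in [0, 1] per map cell, from an initial model
    landslide_cells : indices of cells already labelled as landslides
    low_threshold   : assumed cut-off for "low susceptibility" (hypothetical)
    """
    landslide_cells = set(landslide_cells)
    # Candidate cells: low score and not a known landslide.
    candidates = [i for i, score in enumerate(susceptibility)
                  if score < low_threshold and i not in landslide_cells]
    if n_samples > len(candidates):
        raise ValueError("not enough low-susceptibility cells to sample from")
    return sorted(random.Random(seed).sample(candidates, n_samples))


# Toy map with seven cells; cells 1, 4 and 5 are known landslides.
scores = [0.05, 0.90, 0.10, 0.15, 0.60, 0.02, 0.30]
negatives = sample_non_landslide(scores, landslide_cells=[1, 4, 5], n_samples=2)
print(negatives)
```

Unconstrained random sampling would instead draw from every cell that is not a known landslide, including cells such as index 6 here, whose score is above the assumed cut-off.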

Articles published on Performance Of Machine Learning Classifiers

Applications of Text Mining techniques to extract meaningful information from gastroenterology medical reports

The effect of data complexity on classifier performance

Improving and comparing performance of machine learning classifiers optimized by swarm intelligent algorithms for code smell detection

Enhanced Brain Tumor Classification: A Hybrid Classifier Approach

Retracted

Exploring Stress Detection in Tweets: A Comparative Business Analysis of Classification Algorithms

Revolutionizing gastric cancer diagnosis through advanced machine learning approaches

Analysis of Machine Learning Classifiers for Speaker Identification: A Study on SVM, Random Forest, KNN, and Decision Tree

Boosting and Comparing Performance of Machine Learning Classifiers with Meta-heuristic Techniques to Detect Code Smell

Enhancing landslide susceptibility modelling through a novel non-landslide sampling method and ensemble learning technique

Enhancing Parkinson's disease severity assessment through voice-based wavelet scattering, optimized model selection, and weighted majority voting

Heart Failure prediction on diversified datasets to improve generalizability using 2-Level Stacking

Automated assessment of foot elevation in adults with hereditary spastic paraplegia using inertial measurements and machine learning

Non-contrast CT synthesis using patch-based cycle-consistent generative adversarial network (Cycle-GAN) for radiomics and deep learning in the era of COVID-19

A Machine Learning Method with Hybrid Feature Selection for Improved Credit Card Fraud Detection

From modeling dose-response relationships to improved performance of decision-tree classifiers for predictive toxicology of nanomaterials

Strain FBG-Based Sensor for Detecting Fence Intruders Using Machine Learning and Adaptive Thresholding

Evaluating the Performance of Machine Learning Classifiers for Detecting Twitter Spam

Performance of machine learning algorithms for dementia assessment: impacts of language tasks, recording media, and modalities

Deriving quantitative information from multiparametric MRI via Radiomics: Evaluation of the robustness and predictive value of radiomic features in the discrimination of low-grade versus high-grade gliomas with machine learning
