Comparative Analysis of Cross-Validation Techniques: LOOCV, K-folds Cross-Validation, and Repeated K-folds Cross-Validation in Machine Learning Models

Victor Lumumba, Dennis Kiprotich, Mary Mpaine, Njoka Makena, Musyimi Kavita

doi:10.11648/j.ajtas.20241305.13

Abstract

Effective model evaluation is crucial for robust machine learning, and cross-validation techniques play a significant role in it. This study compares Repeated k-folds Cross-Validation, k-folds Cross-Validation, and Leave-One-Out Cross-Validation (LOOCV) on imbalanced and balanced datasets across four models: Support Vector Machine (SVM), K-Nearest Neighbors (K-NN), Random Forest (RF), and Bagging, each with and without parameter tuning. On imbalanced data without parameter tuning, Repeated k-folds Cross-Validation demonstrated strong performance for SVM, with a sensitivity of 0.541 and a balanced accuracy of 0.764. k-folds Cross-Validation showed a higher sensitivity of 0.784 for RF, with a balanced accuracy of 0.884. In contrast, LOOCV achieved notable sensitivity for RF and Bagging, at 0.787 and 0.784 respectively, but at the cost of lower precision and higher variance, as detailed in Table 1. When parameter tuning was applied to balanced data, the performance metrics improved: sensitivity for SVM reached 0.893 with LOOCV, and balanced accuracy for Bagging increased to 0.895. Stratified k-folds provided enhanced precision and F1-Score for SVM and RF. Processing times varied significantly: k-folds was the most efficient, with SVM taking 21.480 seconds, whereas Repeated k-folds was the most computationally demanding, with RF taking approximately 1986.570 seconds, as shown in Table 4. This analysis underscores that while k-folds and Repeated k-folds are generally efficient, LOOCV and balanced approaches offer enhanced accuracy for specific models but require greater computational resources. The choice of cross-validation technique should therefore be tailored to the dataset characteristics and computational constraints to ensure optimal model evaluation.
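To make the difference between the three resampling schemes concrete, the following is a minimal sketch in plain Python of how each one partitions a dataset and how many model fits it implies. It is not the authors' implementation: the function names (`k_fold`, `repeated_k_fold`, `loocv`) and the values n = 100, k = 10, and 5 repeats are assumptions chosen only for illustration, and the sketch does not stratify by class.

```python
import random
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]  # (train indices, test indices)


def k_fold(n: int, k: int, seed: int = 0) -> Iterator[Split]:
    """Shuffle the indices 0..n-1 and yield k train/test splits."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def repeated_k_fold(n: int, k: int, repeats: int, seed: int = 0) -> Iterator[Split]:
    """Run k-fold `repeats` times, reshuffling with a different seed each time."""
    for r in range(repeats):
        yield from k_fold(n, k, seed + r)


def loocv(n: int) -> Iterator[Split]:
    """Leave-one-out is k-fold with k = n: each test set is a single sample."""
    yield from k_fold(n, n)


# Illustrative sizes (assumptions, not taken from the paper).
n, k, repeats = 100, 10, 5
schemes = {
    "k-folds": k_fold(n, k),
    "Repeated k-folds": repeated_k_fold(n, k, repeats),
    "LOOCV": loocv(n),
}
for name, splits in schemes.items():
    # One model is trained per split, so the split count is the number of fits.
    print(name, "model fits:", sum(1 for _ in splits))
```

Under these assumed sizes the schemes require 10, 50, and 100 model fits respectively. Because one model is trained per split, the fit count for Repeated k-folds grows with the number of repeats and the fit count for LOOCV grows with the number of samples, which is the source of the extra computational demand relative to a single k-folds run.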


More From: American Journal of Theoretical and Applied Statistics


Similar Papers

Stochastic cross validation
Lu Xu ... Yuan-Bin She
Chemometrics and Intelligent Laboratory Systems | VOL. 175
20 Feb 2018

High-content phenotyping of Parkinson's disease patient stem cell-derived midbrain dopaminergic neurons using machine learning classification.
Aurore Vuidel ... Michael Peitz
Stem cell reports | VOL. 17
29 Sep 2022

Nearest neighbour distance matching Leave‐One‐Out Cross‐Validation for map validation
Carles Milà ... Hanna Meyer
Methods in Ecology and Evolution | VOL. 13
07 Apr 2022

Comparasion of Error Rate Prediction Methods of C4.5 Algorithm for Balanced Data
Ichlas Djuazva ... Zilrahmi
UNP Journal of Statistics and Data Science | VOL. 1
28 Aug 2023
