The Application of The Neighborhood Cleaning Rule in Conjunction with Random Forest, K-Fold Cross-Validation, and Grid Search for Addressing Imbalanced Datasets

Handayani Handayani, Laelatul Hikmah, Laila Qadrini, Muh Hijrah

doi:10.47065/tin.v3i8.4124

Abstract

In data mining, classification is the process of finding a model that describes and separates data classes so that the class of an item whose class is unknown can be predicted. Because classification applies to a wide range of problems, numerous strategies have been developed for it. A common issue, however, is class imbalance: real-world classification datasets rarely contain an equal number of examples in each class. Imbalance is not a concern when the class distributions differ only slightly, but when the difference is significant, prediction quality suffers. Prediction models may skew in favor of the majority class, with the minority class contributing little to the model. One frequently used strategy for addressing class imbalance is resampling. This study applies a resampling algorithm, the Neighborhood Cleaning Rule (NCL), in combination with Random Forest, K-Fold Cross-Validation, and Grid Search tuning. The approach was carried out on data covering cases of Low Birth Weight Infants (BBLR) in Majene Regency and on breast cancer diagnoses published on the UCI website. NCL is a data processing method that removes noisy or otherwise disruptive examples from datasets used for modeling or analysis. The resulting model achieves good F1-Score, G-Mean, Accuracy, and Sensitivity values.
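To illustrate why the abstract reports F1-Score, G-Mean, and Sensitivity alongside Accuracy, the following minimal sketch computes these metrics from confusion-matrix counts. It is not the authors' code. The function name `imbalance_metrics` and all counts are hypothetical, and G-Mean is assumed to be the usual binary definition, the square root of sensitivity times specificity, with the minority class treated as positive.

```python
import math


def imbalance_metrics(tp, fn, fp, tn):
    """Compute common imbalance-aware metrics from confusion-matrix counts.

    The positive class is assumed to be the minority class.
    Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on the minority class
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on the majority class
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + sensitivity:
        f1 = 2 * precision * sensitivity / (precision + sensitivity)
    else:
        f1 = 0.0
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "f1": f1,
        "g_mean": g_mean,
    }


# Hypothetical test set: 950 majority-class and 50 minority-class examples.
# A model that always predicts the majority class:
majority_only = imbalance_metrics(tp=0, fn=50, fp=0, tn=950)

# A model that recovers most minority examples at the cost of some false alarms:
minority_aware = imbalance_metrics(tp=40, fn=10, fp=50, tn=900)

for name, scores in [("majority_only", majority_only),
                     ("minority_aware", minority_aware)]:
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

With these made-up counts, the majority-only model reaches 95% accuracy while its sensitivity, F1-Score, and G-Mean are all zero, whereas the second model has slightly lower accuracy (94%) but a sensitivity of 0.8. This is the skew toward the majority class that accuracy alone does not reveal.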


Journal: TIN: Terapan Informatika Nusantara
Publication Date: Jan 30, 2023
License type: CC BY 4.0
