Evaluation of resampling methods in the problem of unbalanced datasets (Ocena metod repróbkowania w problemie zbiorów niezbilansowanych)

Mariusz Kubus

doi:10.15611/eada.2020.1.04

Abstract

In many real-world applications the goal is to predict rare events, so the training sets are highly unbalanced. In such cases classifiers are biased towards correctly predicting the majority class and tend to misclassify the minority class, even though the rare events are of greater interest. To handle this problem, numerous techniques have been proposed that either balance the data or modify the learning algorithms. The goal of this paper is to compare simple random balancing methods with more sophisticated resampling methods that have appeared in the literature and are available in R. Additionally, the authors ask whether learning on the original dataset and classifying with a shifted threshold might be more competitive. The authors provide a survey from the perspective of regularized logistic regression and random forests. The results show that combining random under-sampling with random forests has an advantage over the other techniques, while logistic regression can be competitive in the case of highly unbalanced data.
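
The two simplest ideas named in the abstract, random under-sampling and classification with a shifted threshold, can be illustrated with a minimal sketch. This is not the author's code: the study used R, whereas the sketch below is plain Python, and the function names `random_undersample` and `classify_with_threshold`, the toy data, and the threshold value 0.1 are all assumptions made for illustration.

```python
import random

def random_undersample(X, y, minority_label=1, seed=0):
    """Randomly drop majority-class rows until both classes have equal size.

    Hypothetical helper for illustration; assumes a binary problem.
    """
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == minority_label]
    majority = [i for i, label in enumerate(y) if label != minority_label]
    # Keep as many majority rows as there are minority rows.
    kept = rng.sample(majority, min(len(minority), len(majority)))
    idx = sorted(minority + kept)
    return [X[i] for i in idx], [y[i] for i in idx]

def classify_with_threshold(scores, threshold=0.5):
    """Assign the rare class (1) when the predicted score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]

# Toy data (assumed): 95 majority observations (0) and 5 rare events (1).
X = [[float(i)] for i in range(100)]
y = [1 if i % 20 == 0 else 0 for i in range(100)]

X_bal, y_bal = random_undersample(X, y)

# Alternative to resampling: keep the original data and lower the cut-off.
scores = [0.05, 0.2, 0.6]  # assumed predicted probabilities of the rare class
default_labels = classify_with_threshold(scores)        # cut-off 0.5
shifted_labels = classify_with_threshold(scores, 0.1)   # shifted cut-off
```

With the default cut-off only the last observation is flagged as a rare event, while the shifted cut-off also flags the second one. How far to shift the threshold is a tuning decision that this sketch does not address.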


Journal: Econometrics
Publication Date: Jan 1, 2020
Citations: 7
License type: cc-by-nc-nd


Similar Papers

Comparison metrics for multi-step prediction of rare events in vital sign signals
Pravinkumar G Kandhare ... Arie Nakhmani
Biomedical Signal Processing and Control | VOL. 80
17 Nov 2022

High-throughput experiments for rare-event rupture of materials
Yifan Zhou ... Tongqing Lu
Matter | VOL. 5
20 Jan 2022

System Failure Prediction through Rare-Events Elastic-Net Logistic Regression
Jose M Navarro ... Juan C Duenas
01 Nov 2014

Can Predictive Modeling Tools Identify Patients at High Risk of Prolonged Opioid Use After ACL Reconstruction?
Ashley B Anderson ... Jonathan F Dickens
Clinical Orthopaedics & Related Research | VOL. 478
07 Apr 2020
