A Survey of Predictive Modeling on Imbalanced Domains

Paula Branco, Rita P Ribeiro, Luís Torgo

doi:10.1145/2907070

Open Access

https://doi.org/10.1145/2907070

Abstract

Many real-world data-mining applications involve obtaining predictive models using datasets with strongly imbalanced distributions of the target variable. Frequently, the least-common values of this target variable are associated with events that are highly relevant for end users (e.g., fraud detection, unusual returns on stock markets, anticipation of catastrophes). Moreover, the events may have different costs and benefits, which, when associated with the rarity of some of them in the available training data, create serious problems for predictive modeling techniques. This article presents a survey of existing techniques for handling these important applications of predictive analytics. Although most of the existing work addresses classification tasks (nominal target variables), we also describe methods designed to handle similar problems within regression tasks (numeric target variables). In this survey, we discuss the main challenges raised by imbalanced domains, propose a definition of the problem, describe the main approaches to these tasks, propose a taxonomy of the methods, summarize the conclusions of existing comparative studies as well as some theoretical analyses of some methods, and refer to some related problems within predictive modeling.

Journal: ACM Computing Surveys
Publication Date: Aug 13, 2016
Citations: 783
License type: other-oa

Similar Papers

A Primer on Machine Learning.
Audrene S Edwards ... Bruce Kaplan
Transplantation | VOL. 105
18 Aug 2020

A novel cost‐sensitive algorithm and new evaluation strategies for regression in imbalanced domains
Lamyaa Sadouk ... El Hassan Essoufi
Expert Systems | VOL. 38
28 Feb 2021

Resampling with neighbourhood bias on imbalanced domains
Paula Branco ... Rita P Ribeiro
Expert Systems | VOL. 35
11 Jul 2018

Pre-processing approaches for imbalanced distributions in regression
Paula Branco ... Rita P Ribeiro
Neurocomputing | VOL. 343
03 Feb 2019
