Abstract
A major issue in the classification of class-imbalanced datasets is determining which performance metrics are most suitable. Previous work has shown, through several examples, that imbalance can have a major impact on the value and meaning of accuracy and of certain other well-known performance metrics. In this paper, our approach goes beyond case studies and develops a systematic analysis of this impact by simulating the results obtained with binary classifiers. A set of functions and numerical indicators is obtained which enables the comparison of the behaviour of several performance metrics based on the binary confusion matrix when they are faced with imbalanced datasets. In addition, a new way to measure imbalance is defined which improves upon the Imbalance Ratio used in previous studies. From the simulation results, several clusters of performance metrics have been identified, pointing to the Geometric Mean or Bookmaker Informedness as the best null-biased metrics when their focus on classification successes (disregarding the errors) is not a limitation for the specific application in which they are used. However, if classification errors must also be considered, then the Matthews Correlation Coefficient emerges as the best choice. Finally, a set of null-biased multi-perspective Class Balance Metrics is proposed which extends the concept of Class Balance Accuracy to other performance metrics.
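The following minimal sketch, not taken from the paper, illustrates the standard definitions of the confusion-matrix metrics named in the abstract (Accuracy, Geometric Mean, Bookmaker Informedness, Matthews Correlation Coefficient); the function name and the example counts are illustrative assumptions.

```python
import math

def binary_metrics(tp, fn, fp, tn):
    """Standard confusion-matrix metrics for a binary classifier,
    assuming the positive class is the minority class."""
    tpr = tp / (tp + fn)               # sensitivity / recall
    tnr = tn / (tn + fp)               # specificity
    acc = (tp + tn) / (tp + fn + fp + tn)
    gm = math.sqrt(tpr * tnr)          # Geometric Mean
    bm = tpr + tnr - 1.0               # Bookmaker Informedness (Youden's J)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return {"Accuracy": acc, "GM": gm, "BM": bm, "MCC": mcc}

# Hypothetical imbalanced test set: 10 positives vs. 990 negatives.
# Accuracy stays high even though half of the minority class is missed,
# while GM, BM and MCC reveal the weaker performance.
print(binary_metrics(tp=5, fn=5, fp=10, tn=980))
```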