Limitation of ROC in Evaluation of Classifiers for Imbalanced Data

F Movahedi,J.F Antaki

doi:10.1016/j.healun.2021.01.1160

Abstract

Purpose

ROC is a common evaluation metric for risk scores and classifiers for mortality and adverse events. However, ROC can give a misleadingly optimistic view of classifier performance when the data are imbalanced, for example when the proportion of adverse events is very small. This study illustrates the ambiguity of ROC through a case study of a classifier for post-LVAD Right Heart Failure (RHF), and illustrates the utility of the Precision Recall Curve (PRC) as a supplemental evaluation tool.

Methods

This study included 11,967 patients recorded in INTERMACS who received a continuous-flow LVAD between 2006 and 2016 (mean age of 57; 21% female and 79% male), in whom the incidence of RHF was only 9% at 1 year (1,079 patients). These data were randomly split into a training set (60%) and a test set (40%). A logistic regression model was developed using the training data to predict post-LVAD RHF.

Results

The ROC in Fig. 1A indicates good performance of the RHF classifier, with an Area Under Curve (AUC) of 0.83. In contrast, the PRC in Fig. 1B has an AUC of 0.33 and shows that the precision of the classifier drops rapidly from 1 (100%) to 0.4 (40%) as recall (sensitivity) rises only slightly above 0%. The gray dot in Fig. 1A marks the optimized operating point at which sensitivity and specificity are equal, at approximately 76-77%. At that same sensitivity (76%), however, the precision of the classifier is only 23% (see the gray point in Fig. 1B). This means that only 23% of the patients predicted by this classifier to experience RHF actually do (True RHF); the preponderance (77%) are incorrectly classified (False RHF). ROC does not capture this large number of False RHF predictions because, in the calculation of specificity, False RHF is overwhelmed by the much larger number of patients in the denominator who are free from RHF.

Conclusion

ROC can portray an overly optimistic picture of the performance of a classifier or risk score when applied to imbalanced data. The PRC provides informative insight into the performance of a classifier by focusing on the minority class.
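The gap between a sensitivity and specificity of about 76% and a precision of only 23% follows from Bayes' rule once the 9% incidence is taken into account. The sketch below is not the authors' code; it is a minimal illustration that plugs the rounded figures from the abstract into the standard formula for precision (positive predictive value). The function name `precision_from_rates` and the cohort size of 1,000 are assumptions made purely for illustration.

```python
def precision_from_rates(sensitivity, specificity, prevalence):
    """Precision (positive predictive value) from Bayes' rule.

    All arguments are fractions in [0, 1].
    """
    # Share of the whole cohort that is correctly flagged (true positives).
    true_pos = sensitivity * prevalence
    # Share of the whole cohort that is wrongly flagged (false positives).
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Rounded values taken from the abstract.
PREVALENCE = 0.09   # incidence of RHF at 1 year
SENSITIVITY = 0.76  # recall at the gray operating point

# The abstract gives specificity as "approximately 76-77%", so try both.
for specificity in (0.76, 0.77):
    ppv = precision_from_rates(SENSITIVITY, specificity, PREVALENCE)
    print(f"specificity={specificity:.2f} -> precision={ppv:.3f}")

# The same arithmetic as expected counts in a hypothetical cohort of 1,000.
n = 1000
with_rhf = n * PREVALENCE                      # patients who develop RHF
without_rhf = n - with_rhf                     # patients free from RHF
true_rhf = SENSITIVITY * with_rhf              # correctly predicted RHF
false_rhf = (1.0 - 0.76) * without_rhf         # incorrectly predicted RHF
print(f"True RHF ~ {true_rhf:.0f}, False RHF ~ {false_rhf:.0f} per {n} patients")
```

With these rounded inputs the computed precision comes out at roughly 0.24 to 0.25, close to the reported 23%; a small difference is expected because the inputs are rounded and the prevalence in the test set may not be exactly 9%. The count view makes the mechanism visible: false positives are drawn from the much larger RHF-free group, so even a modest false positive rate yields several False RHF predictions for every True RHF prediction.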


Journal: The Journal of Heart and Lung Transplantation	Publication Date: Mar 20, 2021
Citations: 5

