Evaluating probabilistic classifiers: The triptych

Timo Dimitriadis, Tilmann Gneiting, Alexander I Jordan, Peter Vogel

doi:10.1016/j.ijforecast.2023.09.007

Open Access

https://doi.org/10.1016/j.ijforecast.2023.09.007

Abstract

Probability forecasts for binary outcomes, often referred to as probabilistic classifiers or confidence scores, are ubiquitous in science and society, and methods for evaluating and comparing them are in great demand. We propose and study a triptych of diagnostic graphics focusing on distinct and complementary aspects of forecast performance: Reliability curves address calibration, receiver operating characteristic (ROC) curves diagnose discrimination ability, and Murphy curves visualize overall predictive performance and value. A Murphy curve shows a forecast’s mean elementary scores, including the widely used misclassification rate, and the area under a Murphy curve equals the mean Brier score. For a calibrated forecast, the reliability curve lies on the diagonal, and for competing calibrated forecasts, the ROC and Murphy curves share the same number of crossing points. We invoke the recently developed CORP (Consistent, Optimally binned, Reproducible, and Pool-Adjacent-Violators (PAV) algorithm-based) approach to craft reliability curves and decompose a mean score into miscalibration (MCB), discrimination (DSC), and uncertainty (UNC) components. Plots of the DSC measure of discrimination ability versus the calibration metric MCB visualize classifier performance across multiple competitors. The proposed tools are illustrated in empirical examples from astrophysics, economics, and social science.
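
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy data and all function names are hypothetical, the decomposition is written in the form mean score = MCB − DSC + UNC with the Brier score as the scoring rule, and the elementary score uses an assumed normalization (factor 2) and tie convention (`p > theta`), chosen so that the area under the Murphy curve matches the mean Brier score as the abstract states.

```python
from statistics import fmean


def recalibrate(x, y):
    """PAV-recalibrate forecasts x against binary outcomes y (tied forecasts are pooled first)."""
    groups = {}
    for xi, yi in zip(x, y):
        g = groups.setdefault(xi, [0.0, 0])
        g[0] += yi
        g[1] += 1
    blocks = []  # each block: [sum of outcomes, count, forecast values covered]
    for xi in sorted(groups):
        s, n = groups[xi]
        blocks.append([s, n, [xi]])
        # Pool adjacent blocks while their event frequencies violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s2, n2, xs2 = blocks.pop()
            blocks[-1][0] += s2
            blocks[-1][1] += n2
            blocks[-1][2] += xs2
    calibrated = {xv: s / n for s, n, xs in blocks for xv in xs}
    return [calibrated[xi] for xi in x]


def brier(p, y):
    """Mean Brier score of probability forecasts p for binary outcomes y."""
    return fmean((pi - yi) ** 2 for pi, yi in zip(p, y))


def corp_decomposition(x, y):
    """Split the mean Brier score into miscalibration, discrimination, and uncertainty."""
    s_x = brier(x, y)                    # original forecast
    s_c = brier(recalibrate(x, y), y)    # PAV-recalibrated forecast
    s_r = brier([fmean(y)] * len(y), y)  # constant forecast at the event frequency
    return {"score": s_x, "MCB": s_x - s_c, "DSC": s_r - s_c, "UNC": s_r}


def elementary_score(theta, p, y):
    """Elementary score at threshold theta (assumed normalization and tie handling)."""
    if y == 0 and p > theta:
        return 2 * theta
    if y == 1 and p <= theta:
        return 2 * (1 - theta)
    return 0.0


def murphy_curve(x, y, thetas):
    """Mean elementary score of the forecast at each threshold in thetas."""
    return [fmean(elementary_score(t, xi, yi) for xi, yi in zip(x, y)) for t in thetas]


# Hypothetical toy data: six probability forecasts and the observed binary outcomes.
x = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
y = [0, 0, 1, 0, 1, 1]

d = corp_decomposition(x, y)

# Midpoint rule on [0, 1] approximates the area under the Murphy curve.
n_grid = 1000
thetas = [(k + 0.5) / n_grid for k in range(n_grid)]
area = sum(murphy_curve(x, y, thetas)) / n_grid

# With this normalization, the Murphy curve at theta = 1/2 is the misclassification rate.
misclassification = murphy_curve(x, y, [0.5])[0]

print(d, area, misclassification)
```

On this toy sample, MCB and DSC are both nonnegative because the PAV-recalibrated forecast scores at least as well as both the original forecast and the constant reference forecast. A plot of DSC against MCB for several competing forecasts would correspond to the multi-classifier display described in the abstract.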

Journal: International Journal of Forecasting
Publication Date: Nov 1, 2023
Citations: 2
License type: cc-by
