General Performance Score for classification problems

Isaac Martín De Diego, Rubén R Fernández, Javier M Moguerza, Ana R Redondo, Jorge Navarro

doi:10.1007/s10489-021-03041-7


Open Access

https://doi.org/10.1007/s10489-021-03041-7


Abstract

Several performance metrics are currently available to evaluate the performance of Machine Learning (ML) models in classification problems. ML models are usually assessed using a single measure because it facilitates the comparison between several models. However, there is no silver bullet since each performance metric emphasizes a different aspect of the classification. Thus, the choice depends on the particular requirements and characteristics of the problem. An additional problem arises in multi-class classification problems, since most of the well-known metrics are only directly applicable to binary classification problems. In this paper, we propose the General Performance Score (GPS), a methodological approach to build performance metrics for binary and multi-class classification problems. The basic idea behind GPS is to combine a set of individual metrics, penalising low values in any of them. Thus, users can combine several performance metrics that are relevant to the particular problem, chosen according to their preferences, and obtain a conservative combination. Different GPS-based performance metrics are compared with alternatives in classification problems using real and simulated datasets. The metrics built using the proposed method improve the stability and explainability of the usual performance metrics. Finally, the GPS brings benefits in both new research lines and practical usage, where performance metrics tailored for each particular problem are considered.
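The idea of combining metrics while penalising low values can be illustrated with a harmonic mean, which is the form the Highlights point to. The following is a minimal sketch under that assumption, not the authors' implementation; the function name `gps` and the example metric values are hypothetical.

```python
from statistics import harmonic_mean, mean


def gps(metrics):
    """Combine metric values in [0, 1] with a harmonic mean.

    Assumption: the GPS combination is taken to be the plain harmonic mean
    of the chosen metrics. `metrics` maps a metric name to its value.
    """
    return harmonic_mean(list(metrics.values()))


# Two hypothetical models with the same arithmetic mean of TPR and TNR.
balanced = {"TPR": 0.80, "TNR": 0.80}
lopsided = {"TPR": 0.99, "TNR": 0.61}

for name, model in (("balanced", balanced), ("lopsided", lopsided)):
    values = list(model.values())
    print(name, "arithmetic:", round(mean(values), 4), "harmonic:", round(gps(model), 4))
```

Both models have an arithmetic mean of 0.80, but the harmonic combination of the lopsided model drops to roughly 0.755, so the weak TNR is not hidden by the strong TPR.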

Highlights

Supervised Learning is the set of Machine Learning (ML) techniques that use labelled data
True Positive Rate (TPR) is usually plotted versus False Positive Rate (FPR), where FPR = 1 − True Negative Rate (TNR)
Given that the combined harmonic mean of two sets of variables is equal to the harmonic mean of the harmonic means of the two sets [18], the previous expression can be simplified to General Performance Score (GPS)(PPV, TPR, TNR, NPV), where NPV is the Negative Predictive Value
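The harmonic-mean identity quoted in the last highlight can be checked directly for two sets of the same size, which is the situation when two pairs of metrics are merged. A short derivation follows, using generic symbols a, b, c, d and H for the harmonic mean; these symbols are illustrative and are not taken from the paper.

```latex
\begin{align*}
H(a, b) &= \frac{2}{\frac{1}{a} + \frac{1}{b}}, \qquad
H(c, d) = \frac{2}{\frac{1}{c} + \frac{1}{d}}, \\
H\bigl(H(a, b),\, H(c, d)\bigr)
  &= \frac{2}{\frac{1}{H(a, b)} + \frac{1}{H(c, d)}}
   = \frac{2}{\frac{1}{2}\left(\frac{1}{a} + \frac{1}{b}\right)
            + \frac{1}{2}\left(\frac{1}{c} + \frac{1}{d}\right)} \\
  &= \frac{4}{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}
   = H(a, b, c, d).
\end{align*}
```

For sets of unequal size the two inner means would have to be weighted by the size of each set for the equality to hold.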

Summary

Introduction

Supervised Learning is the set of Machine Learning (ML) techniques that use labelled data. Given a classification ML model, the information regarding its performance is summarised in a confusion matrix. This matrix is built by comparing the observed and predicted classes for a set of observations. In many binary classification problems, alternative measures that combine two metrics regarding the classification task in both classes are more appropriate. The GPS is obtained from the combination of several metrics estimated through a K × K confusion matrix, with K ≥ 2. This family of metrics works for both binary and multi-class classification. Contribution: a novel family of performance metrics, GPS, is developed for both binary and multi-class classification.
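As a rough illustration of estimating metrics from a K × K confusion matrix and combining them, the sketch below computes per-class recall and precision and merges them with a harmonic mean. The matrix layout (rows are observed classes, columns are predicted classes), the choice of metrics, the function name and the example counts are all assumptions made for this example, not details taken from the paper.

```python
from statistics import harmonic_mean


def per_class_recall_precision(cm):
    """Return per-class recall and precision from a K x K confusion matrix.

    Assumed layout: cm[i][j] counts observations of class i predicted as j.
    Every class must be observed and predicted at least once, otherwise a
    division by zero occurs.
    """
    k = len(cm)
    row_totals = [sum(cm[i]) for i in range(k)]                       # observed counts
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # predicted counts
    recall = [cm[i][i] / row_totals[i] for i in range(k)]
    precision = [cm[j][j] / col_totals[j] for j in range(k)]
    return recall, precision


# Hypothetical 3-class confusion matrix (K = 3).
cm = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 2, 8],
]

recall, precision = per_class_recall_precision(cm)
# Combine all six per-class values into one conservative score.
score = harmonic_mean(recall + precision)
print("recall:", recall)
print("precision:", [round(p, 4) for p in precision])
print("combined score:", round(score, 4))
```

Because the harmonic mean never exceeds the arithmetic mean, the combined score is pulled towards the weakest class (class 1 in this example) rather than being averaged away.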

Binary classification

Multi‐class classification

General Performance Score

Experiments

Simulated confusion matrices in binary classification

Binary classification with real datasets

Simulated confusion matrices in multi‐class classification

Multi‐class classification with real datasets

Findings

Conclusions


Journal: Applied intelligence (Dordrecht, Netherlands)
Publication Date: Jan 31, 2022
Citations: 45
License type: open-access


