Uncertain of uncertainties? A comparison of uncertainty quantification metrics for chemical data sets

Maria H Rasmussen, Chenru Duan, Heather J Kulik, Jan H Jensen

doi:10.1186/s13321-023-00790-0


Open Access

https://doi.org/10.1186/s13321-023-00790-0


Journal: Journal of Cheminformatics
Publication Date: Dec 18, 2023
Citations: 8
License type: CC BY 4.0

Affiliation: University of Copenhagen

Abstract

With the increasingly important role of machine learning (ML) models in chemical research, the need to attach a level of confidence to model predictions naturally arises. Several methods for obtaining uncertainty estimates have been proposed in recent years, but consensus on how to evaluate them has yet to be established, and different studies on uncertainties generally use different metrics to evaluate them. We compare three of the most popular validation metrics (Spearman’s rank correlation coefficient, the negative log likelihood (NLL) and the miscalibration area) to the error-based calibration introduced by Levi et al. (Sensors 2022, 22, 5540). Importantly, metrics such as the NLL and Spearman’s rank correlation coefficient carry little information in themselves. We therefore introduce reference values obtained through errors simulated directly from the uncertainty distribution. The different metrics target different properties and we show how to interpret them, but we generally find that the best overall validation is based on the error-based calibration plot introduced by Levi et al. Finally, we illustrate the sensitivity of ranking-based methods (e.g. Spearman’s rank correlation coefficient) towards test set design by using the same toy model on different test sets and obtaining vastly different metrics (0.05 vs. 0.65).
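The idea of a simulated reference value can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each error is drawn from a zero-mean Gaussian whose standard deviation equals the predicted uncertainty, and the function names (`simulated_reference`, `gaussian_nll`, `spearman`) and the demo uncertainties are hypothetical. Under those assumptions it estimates the Spearman coefficient and NLL that a perfectly calibrated model would obtain for a given set of uncertainties.

```python
import math
import random


def ranks(values):
    """Return 0-based ranks (assumes no ties, which holds for continuous draws)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def gaussian_nll(errors, sigmas):
    """Mean negative log likelihood of errors under N(0, sigma^2) (assumed model)."""
    total = sum(
        0.5 * math.log(2 * math.pi * s * s) + e * e / (2 * s * s)
        for e, s in zip(errors, sigmas)
    )
    return total / len(errors)


def simulated_reference(sigmas, n_draws=200, seed=0):
    """Average Spearman and NLL over errors simulated from N(0, sigma_i^2)."""
    rng = random.Random(seed)
    rho_sum = nll_sum = 0.0
    for _ in range(n_draws):
        errors = [rng.gauss(0.0, s) for s in sigmas]
        rho_sum += spearman([abs(e) for e in errors], sigmas)
        nll_sum += gaussian_nll(errors, sigmas)
    return rho_sum / n_draws, nll_sum / n_draws


# Hypothetical predicted uncertainties, for illustration only.
demo_rng = random.Random(1)
demo_sigmas = [demo_rng.uniform(0.1, 2.0) for _ in range(500)]
rho_ref, nll_ref = simulated_reference(demo_sigmas)
print(f"reference Spearman: {rho_ref:.2f}, reference NLL: {nll_ref:.2f}")
```

A Spearman coefficient or NLL measured on real errors could then be compared against `rho_ref` and `nll_ref` rather than read in isolation. Passing a set of uncertainties with a narrower spread to `simulated_reference` lowers the reference Spearman value in this sketch, which is one way to see why ranking-based metrics depend on test set design.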


