Abstract

Background

In clinical scenarios, incorrect predictions by machine learning (ML) models are inevitable. One way to reduce misleading predictions is to withhold predictions that fall within a predefined “gray zone”. This method improves predictive performance by reporting only the less uncertain cases. However, the cost-effectiveness of applying the “gray zone” rule to a given ML model cannot be assessed without extensive computation. This study therefore proposes a novel metric for evaluating the effectiveness of gray zones and validates it in real-world ML models.

Methods

This study defined a statistical metric called the “discriminative index” (D-index) for evaluating the effectiveness of gray zones. To calculate the D-index, the predictive outcomes of the ML model are first transformed into two probability distributions based on the truth labels (e.g., positive or negative). The D-index is then derived from the kurtosis of these two distributions. To validate the metric, we applied the D-index to three antibiotic susceptibility-predicting ML models based on mass spectrometry data: a convolutional neural network (CNN), a random forest (RF), and XGBoost (XGB). We assessed the performance and the number of unpredicted cases of each model under different gray zones and correlated the results with the proposed D-index.

Results

The D-index values for the CNN, XGB, and RF models were 4.36, 0.38, and −1.66, respectively. When the “gray zone” rule was applied to achieve an area under the receiver operating characteristic curve (AUROC) of 90%, the CNN, XGB, and RF models retained up to 90%, 68%, and 62% of total cases, respectively. A higher D-index value indicates a more effective application of the gray zone rule.

Conclusion

The D-index is a simple and statistically insightful metric for evaluating the cost-effectiveness of applying the gray zone rule in an ML model. This metric has been validated in three mass spectrometry-based predictive models and has shown promising results. The D-index can be a useful tool for comparing and applying different ML algorithms.
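
The gray zone rule and the first step of the D-index calculation described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the 0/1 label encoding, and the gray zone bounds are assumptions made for illustration. The abstract does not state how the two kurtosis values are combined into the D-index, so the sketch stops at the per-class kurtosis and does not attempt to compute the D-index itself.

```python
from statistics import fmean


def apply_gray_zone(scores, labels, low, high):
    """Withhold predictions whose score lies inside the gray zone [low, high].

    Returns the retained (score, label) pairs and the fraction of cases retained.
    The bounds are illustrative; the abstract does not specify how they are chosen.
    """
    kept = [(s, y) for s, y in zip(scores, labels) if not (low <= s <= high)]
    retained_fraction = len(kept) / len(scores) if scores else 0.0
    return kept, retained_fraction


def excess_kurtosis(values):
    """Population excess kurtosis, m4 / m2**2 - 3 (0 for a normal distribution)."""
    values = list(values)
    mean = fmean(values)
    m2 = fmean([(v - mean) ** 2 for v in values])
    m4 = fmean([(v - mean) ** 4 for v in values])
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant values")
    return m4 / m2 ** 2 - 3.0


def class_kurtoses(scores, labels):
    """Split scores by truth label (assumed 1 = positive, 0 = negative).

    Returns the excess kurtosis of the positive and negative score groups.
    How these two values form the D-index is not given in the abstract.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return excess_kurtosis(pos), excess_kurtosis(neg)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    scores = [0.05, 0.10, 0.45, 0.55, 0.90, 0.95]
    labels = [0, 0, 0, 1, 1, 1]
    kept, fraction = apply_gray_zone(scores, labels, low=0.4, high=0.6)
    print(len(kept), fraction)
    print(class_kurtoses(scores, labels))
```

Sweeping `low` and `high` over a grid and recording the retained fraction together with a performance measure at each setting corresponds to the computation-heavy evaluation that the D-index is intended to summarize.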

