Abstract

Risk prediction models are a crucial tool in healthcare. Risk prediction models with a binary outcome (i.e., binary classification models) are often constructed using methodology that assumes the costs of different classification errors are equal. In many healthcare applications, this assumption is not valid, and the differences between misclassification costs can be quite large. For instance, in a diagnostic setting, the cost of misdiagnosing a person with a life-threatening disease as healthy may be larger than the cost of misdiagnosing a healthy person as a patient. In this article, we present Tailored Bayes (TB), a novel Bayesian inference framework which "tailors" model fitting to optimize predictive performance with respect to unbalanced misclassification costs. We use simulation studies to showcase when TB is expected to outperform standard Bayesian methods in the context of logistic regression. We then apply TB to three real-world applications (a cardiac surgery prognostication task, a breast cancer prognostication task, and a breast cancer tumor classification task) and demonstrate the improvement in predictive performance over standard methods.

Highlights

  • Risk prediction models are widely used in healthcare (Roques and others, 2003; Hippisley-Cox and others, 2008; Wishart and others, 2012)

  • Models for binary outcomes are often constructed to minimize the expected classification error; that is, the proportion of incorrect classifications (Zhang, 2004; Steinwart, 2005; Bartlett and others, 2006). We refer to this paradigm as the standard classification paradigm. The disadvantage of this paradigm is that it implicitly assumes that all classification errors have equal costs, that is, the cost of misclassification of a positive label equals the cost of misclassification of a negative label. (Throughout the document, we refer to the costs of incorrect classifications as misclassification costs)

  • Real data applications: We evaluate the performance of Tailored Bayes (TB) on three real-data applications involving a breast cancer prognostication task (Section 4.1), a cardiac surgery prognostication task (Section 4.2) and a breast cancer tumor classification task (Section S8 of the Supplementary material available at Biostatistics online)


Summary

Introduction

Risk prediction models are widely used in healthcare (Roques and others, 2003; Hippisley-Cox and others, 2008; Wishart and others, 2012). In cancer diagnosis, a false negative (i.e., misdiagnosing a cancer patient as healthy) could have more severe consequences than a false positive (i.e., misdiagnosing a healthy individual with cancer); the latter may lead to extra medical costs and unnecessary anxiety for the individual but not result in loss of life. For such applications, a prioritized control of asymmetric misclassification costs is needed.
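To make the effect of asymmetric misclassification costs concrete, the following is a minimal sketch of the standard decision-theoretic rule for a single prediction. It is not the Tailored Bayes method itself; the function names and the example cost values (`cost_fn`, `cost_fp`) are illustrative assumptions, not taken from the article. Given a predicted probability p of the positive class, predicting positive has expected cost (1 - p) * cost_fp and predicting negative has expected cost p * cost_fn, so the lower-cost decision is positive whenever p exceeds cost_fp / (cost_fp + cost_fn).

```python
def cost_threshold(cost_fn: float, cost_fp: float) -> float:
    """Probability above which predicting positive has lower expected cost.

    cost_fn: cost of a false negative (hypothetical value chosen by the user)
    cost_fp: cost of a false positive (hypothetical value chosen by the user)
    """
    if cost_fn <= 0 or cost_fp <= 0:
        raise ValueError("misclassification costs must be positive")
    return cost_fp / (cost_fp + cost_fn)


def expected_cost(prob_positive: float, decision: int,
                  cost_fn: float, cost_fp: float) -> float:
    """Expected cost of a decision (1 = positive, 0 = negative)."""
    if decision == 1:
        # A positive call is wrong only if the true label is negative.
        return (1.0 - prob_positive) * cost_fp
    # A negative call is wrong only if the true label is positive.
    return prob_positive * cost_fn


def classify(prob_positive: float, cost_fn: float, cost_fp: float) -> int:
    """Return the decision with the lower expected cost."""
    return 1 if prob_positive > cost_threshold(cost_fn, cost_fp) else 0


if __name__ == "__main__":
    p = 0.2  # illustrative predicted risk
    print(classify(p, cost_fn=1.0, cost_fp=1.0))  # equal costs: threshold 0.5
    print(classify(p, cost_fn=9.0, cost_fp=1.0))  # false negatives 9x costlier
```

With equal costs the threshold is 0.5, which recovers the standard classification paradigm; when a false negative is assumed to be nine times as costly as a false positive, the threshold drops to 0.1, so the same predicted risk of 0.2 is classified as positive.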
