Abstract

Background: Many measures of prediction accuracy have been developed. However, the most popular ones in typical medical outcome prediction settings require additional investigation of calibration.

Methods: We show how rescaling the Brier score produces a measure that combines discrimination and calibration in one value and improves interpretability by adjusting for a benchmark model. We have called this measure the index of prediction accuracy (IPA). The IPA permits a common interpretation across binary, time to event, and competing risk outcomes. We illustrate this measure using example datasets.

Results: The IPA is simple to compute, and example code is provided. The values of the IPA appear very interpretable.

Conclusions: IPA should be a prominent measure reported in studies of medical prediction model performance. However, IPA is only a measure of average performance and, by default, does not measure the utility of a medical decision.

Highlights

  • Many measures of prediction accuracy have been developed

  • The purpose of this paper is to popularize a measure which scales the Brier score with the benchmark value, the index of prediction accuracy (IPA), and illustrate how it can be adapted to multiple settings when examining the performance of a statistical prediction model applied to a validation dataset

  • The largest drop in IPA was observed for the ERG status (6.3%)


Summary

Introduction

Many measures of prediction accuracy have been developed. However, the most popular ones in typical medical outcome prediction settings require additional investigation of calibration. We show how rescaling the Brier score produces a measure that combines discrimination and calibration in one value and improves interpretability by adjusting for a benchmark model. We have called this measure the index of prediction accuracy (IPA). For reasons related to interpretability, the concordance statistics, including Harrell’s c-index [1, 2] and the area under the (time-dependent) ROC curve [3,4,5], have found dramatic popularity. They are quite intuitive and relatively easy to interpret, at least for pairs of subjects.
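To make the rescaling of the Brier score concrete, here is a minimal Python sketch for the binary outcome case only. It assumes the benchmark is a null model that predicts the observed event prevalence for every subject, and that IPA is computed as 1 minus the ratio of the model's Brier score to the benchmark's Brier score. The function names `brier_score` and `ipa_binary` are hypothetical and are not taken from the paper's example code; the time to event and competing risk settings need censoring adjustments that this sketch does not attempt.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between 0/1 outcomes and predicted risks."""
    if len(y_true) != len(p_pred) or len(y_true) == 0:
        raise ValueError("y_true and p_pred must be non-empty and of equal length")
    return sum((y - p) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def ipa_binary(y_true, p_pred):
    """Illustrative IPA for a binary outcome: 1 - Brier(model) / Brier(null).

    Assumption: the benchmark (null) model predicts the observed
    prevalence in the validation data for every subject.
    """
    prevalence = sum(y_true) / len(y_true)
    null_pred = [prevalence] * len(y_true)
    bs_model = brier_score(y_true, p_pred)
    bs_null = brier_score(y_true, null_pred)
    if bs_null == 0:
        # All outcomes are identical, so the benchmark is already perfect
        # and the ratio is undefined.
        raise ValueError("IPA is undefined when all outcomes are identical")
    return 1.0 - bs_model / bs_null


# Toy validation data (made up purely for illustration).
outcomes = [0, 0, 1, 1]
risks = [0.1, 0.2, 0.8, 0.9]
print(round(ipa_binary(outcomes, risks), 3))
```

Under these assumptions, perfect predictions give a value of 1, predictions no better than the prevalence give 0, and predictions worse than the benchmark give a negative value. Poor calibration and poor discrimination both raise the model's Brier score, so both lower the result.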

