Abstract

Background: Probabilistic assessments of clinical care are essential for quality care. Yet machine learning (ML), which supports this care process, has been limited to categorical results. To maximize its usefulness, it is important to find novel approaches that calibrate the ML output with a likelihood scale. Current state-of-the-art calibration methods are generally accurate and applicable to many ML models, but improved granularity and accuracy of such methods would increase the information available for clinical decision making.

Methods: This novel non-parametric Bayesian approach is demonstrated on a variety of data sets, including simulated classifier outputs, biomedical data sets from the University of California, Irvine (UCI) Machine Learning Repository, and a clinical data set built to determine suicide risk from the language of emergency department patients.

Results: The method is first demonstrated on support-vector machine (SVM) models, which generally produce well-behaved, well-understood scores. The method produces calibrations that are comparable to the state-of-the-art Bayesian Binning in Quantiles (BBQ) method when the SVM models are able to effectively separate cases and controls. However, as the SVM models' ability to discriminate classes decreases, our approach yields more granular and dynamic calibrated probabilities compared to the BBQ method. Improvements in granularity and range are even more dramatic when the discrimination between the classes is artificially degraded by replacing the SVM model with an ad hoc k-means classifier.

Conclusions: The method allows both clinicians and patients to have a more nuanced view of the output of an ML model, allowing better decision making. The method is demonstrated on simulated data, various biomedical data sets, and a clinical data set, to which diverse ML methods are applied. Trivially extending the method to (non-ML) clinical scores is also discussed.

Highlights

  • Probabilistic assessments of clinical care are essential for quality care

  • For the simulated data sets, reliability diagrams are constructed for various overlaps in the simulated machine learning (ML) output distributions

  • The χ² p-values quantifying the goodness of fit to a slope of 1, the number of calibration points, and the range in the calibrated probabilities are averaged and plotted. (The χ² is calculated by weighting the residuals by the inverse of the standard deviation of the calibrated probabilities)
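
The χ² statistic described in the last highlight can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the example points, and the choice of residual (observed fraction of positives minus calibrated probability) are all hypothetical. The only detail taken from the text is that each residual is weighted by the inverse of the standard deviation of the calibrated probability.

```python
from math import fsum


def chi2_to_identity(calibrated, observed, sigma):
    """Weighted chi-square of reliability-diagram points against the line y = x.

    calibrated: calibrated probabilities (x-axis of the reliability diagram)
    observed:   observed fraction of positives for each point (y-axis)
    sigma:      standard deviation of each calibrated probability
    """
    if not (len(calibrated) == len(observed) == len(sigma)):
        raise ValueError("inputs must have equal length")
    if any(s <= 0 for s in sigma):
        raise ValueError("standard deviations must be positive")
    # Each residual is divided by sigma (weighted by 1/sigma) before squaring.
    return fsum(((o - p) / s) ** 2 for p, o, s in zip(calibrated, observed, sigma))


def probability_range(calibrated):
    """Spread of the calibrated probabilities (largest minus smallest)."""
    return max(calibrated) - min(calibrated)


# Hypothetical reliability-diagram points, for illustration only.
calibrated = [0.10, 0.35, 0.60, 0.90]
observed = [0.12, 0.30, 0.65, 0.88]
sigma = [0.05, 0.05, 0.05, 0.05]

chi2 = chi2_to_identity(calibrated, observed, sigma)
n_points = len(calibrated)
spread = probability_range(calibrated)
print(chi2, n_points, spread)
```

A p-value would then follow from the χ² distribution with the appropriate degrees of freedom, for example with `scipy.stats.chi2.sf(chi2, df)`. The degrees of freedom used in the paper are not stated here, so that choice is left open.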


Summary

Introduction

Probabilistic assessments of clinical care are essential for quality care. Yet machine learning, which supports this care process, has been limited to categorical results. Clinical decision support systems can be defined as any software designed to directly aid in clinical decision making, in which characteristics of individual patients are matched to a computerized knowledge base for the purpose of generating patient-specific assessments or recommendations that are presented to clinicians for consideration [1, 2]. They are important in the practice of medicine because they can improve practitioner performance [1, 3, 4, 5]. Machine learning (ML) gives computers the ability to learn from, and make predictions on, data without being explicitly programmed regarding the characteristics of that data [17]. Healthcare data are complex: they can be distributed, structured, unstructured, incomplete, and not always generalizable.

