Abstract

Objective: The objective of this study was to compare the performance of several commonly used machine learning methods to traditional statistical methods for predicting emergency department and hospital utilization among patients receiving publicly-funded home care services.

Study design and setting: We conducted a population-based retrospective cohort study of publicly-funded home care recipients in the Hamilton-Niagara-Haldimand-Brant region of southern Ontario, Canada between 2014 and 2016. Gradient boosted trees, neural networks, and random forests were tested against two variations of logistic regression for predicting three outcomes related to emergency department and hospital utilization within six months of a comprehensive home care clinical assessment. Models were trained on data from years 2014 and 2015 and tested on data from 2016. Performance was compared using logarithmic score, Brier score, AUC, and diagnostic accuracy measures.

Results: Gradient boosted trees achieved the best performance on all three outcomes. Gradient boosted trees provided small but statistically significant performance gains over both traditional methods on all three outcomes, while neural networks significantly outperformed logistic regression on two of three outcomes. However, sensitivity and specificity gains from using gradient boosted trees over logistic regression were only in the range of 1%-2% at several classification thresholds.

Conclusion: Gradient boosted trees and simple neural networks yielded small performance benefits over logistic regression for predicting emergency department and hospital utilization among patients receiving publicly-funded home care. However, the performance benefits were of negligible clinical importance.
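The performance measures named in the abstract (logarithmic score, Brier score, AUC, and sensitivity/specificity at a classification threshold) can be illustrated with a minimal sketch. This is not the study's code: the function names, the toy data, and the sign convention for the logarithmic score (mean negative log-likelihood, lower is better) are assumptions made for illustration only.

```python
import math

def brier_score(y_true, p_hat):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_hat)) / len(y_true)

def log_score(y_true, p_hat, eps=1e-15):
    """Mean negative log-likelihood of the observed outcomes (assumed convention; lower is better)."""
    total = 0.0
    for y, p in zip(y_true, p_hat):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

def auc(y_true, p_hat):
    """Probability that a random positive case is scored above a random negative case (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(y_true, p_hat, threshold):
    """Classify as positive when the predicted probability is at or above the threshold."""
    tp = sum(1 for y, p in zip(y_true, p_hat) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, p_hat) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(y_true, p_hat) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, p_hat) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical toy data: 1 = outcome occurred within the follow-up window, 0 = it did not.
y_true = [0, 0, 1, 1]
p_hat = [0.1, 0.4, 0.35, 0.8]

print("Brier score:", brier_score(y_true, p_hat))
print("Log score:  ", log_score(y_true, p_hat))
print("AUC:        ", auc(y_true, p_hat))
print("Sens/spec at 0.5:", sensitivity_specificity(y_true, p_hat, 0.5))
```

Brier score and logarithmic score reward well-calibrated probabilities, whereas AUC depends only on how cases are ranked. That is one reason a comparison of this kind reports several measures rather than a single one.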

Highlights

  • Risk prediction models are commonly used across clinical practice for case-finding, triaging, and to inform clinical decision-making and care planning

  • There has been considerable interest in recent years in using machine learning approaches to improve clinical risk prediction, with examples published in fields such as cardiology, rheumatology, oncology, and perioperative care [3,4,5,6]

  • While conventional logistic regression approaches often involve the selection of predictors based on expert knowledge or p-value thresholds, the models used in this study were not determined by a model building process but included all predictors irrespective of statistical significance or theoretical relevance


Summary

Introduction

Risk prediction models are commonly used across clinical practice for case-finding, triaging, and to inform clinical decision-making and care planning. Prognostic models have traditionally been derived using conventional statistical methods such as multivariable logistic regression. These classical approaches come with additivity and linearity assumptions which aid in the interpretability of the model but may represent first-order approximations of the true underlying relationships [1]. There are numerous algorithms from the machine learning and data mining literature that have been developed for prediction. While these methods provide predictions only, rather than an interpretable model, they are considerably more flexible than traditional methods and can better account for non-linearities and interaction effects in predictors [2]. While some studies have shown that machine learning methods offer significant performance improvements [7,8], others have found little difference [9], and still others have concluded that traditional statistical approaches provide the best performance in some cases [10,11].

