Improving the Prediction of Persistent High Health Care Utilizers: Retrospective Analysis Using Ensemble Methodology.

Stephanie N Howson, Raghav Ramachandran, Hsien-Yen Chang, Hadi Kharrazi, Michael J Mcshea, Howard S Burkom, Jonathan P Weiner

doi:10.2196/33212

Abstract

Background: A small proportion of high-need patients persistently use the bulk of health care services and incur disproportionate costs. Population health management (PHM) programs often refer to these patients as persistent high utilizers (PHUs). Accurate PHU prediction enables PHM programs to better align scarce health care resources with high-need PHUs while generally improving outcomes. While prior research in PHU prediction has shown promise, the traditional regression methods used in these studies have yielded limited accuracy.

Objective: We sought to improve PHU prediction with an ensemble approach in a retrospective observational study using insurance claim records.

Methods: We defined a PHU as a patient with health care costs in the top 20% of all patients for 4 consecutive 6-month periods. We used 2013 claims data to predict PHU status over the next 24 months. Our study population included 165,595 patients in the Johns Hopkins Health Care plan, with 8359 (5.1%) patients identified as PHUs in 2014 and 2015. We assessed the performance of several standalone machine learning methods and then of an ensemble approach combining multiple models.

Results: The candidate ensemble with complement naïve Bayes and random forest layers produced higher sensitivity and positive predictive value (PPV; 49.0% and 50.3%, respectively) than logistic regression (46.8% and 46.1%, respectively).

Conclusions: Our results suggest that ensemble machine learning can improve prediction of care management needs. Improved PPV implies fewer incorrect referrals of low-risk patients. With the improved balance of sensitivity and PPV in this approach, resources may be directed more efficiently to the patients who need them most.
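The PHU definition and the two reported metrics can be illustrated with a minimal sketch. This is not the authors' code: the function names `label_phu` and `sensitivity_and_ppv`, the cost-matrix layout (one row per patient, one column per 6-month period), the strict "above the 80th percentile" tie rule, and NumPy's default quantile interpolation are all assumptions made for illustration.

```python
import numpy as np


def label_phu(costs, top_fraction=0.20):
    """Flag patients whose cost is in the top `top_fraction` in every period.

    costs: array-like of shape (n_patients, n_periods), one column per
    6-month period (hypothetical layout; 4 columns would cover 24 months).
    """
    costs = np.asarray(costs, dtype=float)
    # Per-period cost threshold at the (1 - top_fraction) quantile.
    thresholds = np.quantile(costs, 1.0 - top_fraction, axis=0)
    # Assumption: "top 20%" means strictly above the per-period threshold.
    in_top = costs > thresholds
    # A PHU must be in the top group in all consecutive periods supplied.
    return in_top.all(axis=1)


def sensitivity_and_ppv(y_true, y_pred):
    """Return (sensitivity, PPV) for binary labels given as 0/1 or booleans."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)    # PHUs correctly flagged
    fn = np.sum(y_true & ~y_pred)   # PHUs missed
    fp = np.sum(~y_true & y_pred)   # non-PHUs incorrectly flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return float(sensitivity), float(ppv)
```

Sensitivity here is the share of true PHUs that a model flags, and PPV is the share of flagged patients who are true PHUs, so a higher PPV corresponds to fewer low-risk patients being referred.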

Full Text