Abstract

Healthcare cost predictions are widely used throughout the healthcare system. However, predicting these costs is difficult because of both uncertainty and the complex interactions among multiple chronic diseases: treatment decisions for one chronic condition are affected by the presence of other conditions. We propose a novel modeling approach inspired by backward elimination and designed to minimize information loss. Our approach is based on a cost hierarchy: the cost of each condition is modeled as a function of the number of other, more expensive chronic conditions the individual member has. Using this approach, we estimate the additive cost of chronic diseases and study their cost patterns. Using large-scale claims data collected from 2007 to 2012, we identify members who suffer from one or more chronic conditions and estimate their total 2012 healthcare expenditures. We apply regression analysis and clustering to characterize the cost patterns of 69 chronic conditions. We observe that the estimated cost of some conditions (for example, organic brain problem) decreases as the member’s number of more expensive chronic conditions increases. Other conditions, such as obesity and paralysis, demonstrate the opposite pattern: their contribution to the overall cost increases as the member’s number of other, more serious chronic conditions increases. The modeling framework allows us to account for the complex interactions of multimorbidity and healthcare costs. It therefore offers a deeper and more nuanced understanding of the cost burden of chronic conditions, which practitioners and policy makers can use to plan, design better interventions, and identify subpopulations that require additional resources.
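To make the cost hierarchy concrete, the following is a minimal sketch of how the hierarchy feature might be built, not the authors' implementation. It assumes each condition has a known position in a cost ranking and, for every condition a member has, counts how many more expensive conditions that member also has. The condition labels `A`, `B`, `C`, the `COST_RANK` mapping, and both function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical cost ranking: a lower rank means a more expensive condition.
# The labels and their order are illustrative, not taken from the study.
COST_RANK = {"A": 0, "B": 1, "C": 2}


def hierarchy_features(conditions, cost_rank=COST_RANK):
    """Map each of a member's conditions to the number of more expensive
    conditions the same member has (assumes ranks are distinct)."""
    ranked = sorted(set(conditions), key=lambda c: cost_rank[c])
    # After sorting by rank, a condition's position equals the number of
    # more expensive conditions that precede it.
    return {cond: position for position, cond in enumerate(ranked)}


def group_members(members, cost_rank=COST_RANK):
    """Group member ids by (condition, number of more expensive conditions)."""
    groups = defaultdict(list)
    for member_id, conditions in members.items():
        for cond, n_more in hierarchy_features(conditions, cost_rank).items():
            groups[(cond, n_more)].append(member_id)
    return dict(groups)


# Toy members, each with a list of chronic conditions (made-up data).
members = {
    "m1": ["C"],
    "m2": ["C", "B"],
    "m3": ["C", "B", "A"],
}
groups = group_members(members)
```

Under these assumptions, each `(condition, count)` cell could serve as one term in a cost regression, so that the estimated contribution of a condition is allowed to vary with the number of more expensive conditions present. This avoids enumerating every possible combination of conditions.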
More broadly, our hierarchical model approach captures complex interactions and can be applied to improve decision making when the enumeration of all possible factor combinations is not possible, for example, in financial risk scoring and pay structure design.

History: Rema Padman served as senior editor for this article.

Data Ethics & Reproducibility Note: This study is based on proprietary deidentified insurance claims data, so it is not possible to share the original data. To assist in reproducibility, the complete output of the model and statistics related to the cost and prevalence of the conditions studied as well as the diagnosis codes used are included in the online supplement. The modeling approach in this study utilizes healthcare costs as a proxy for severity, which can cause racial disparities. We discuss this in more detail in the Discussion section. The research plan for this study was approved by the institutional review board at the University of Maryland, College Park on April 28, 2020. The code capsule is available on Code Ocean at https://doi.org/10.24433/CO.6703019.v1 and https://doi.org/10.24433/CO.1745085.v1 and in the e-companion to this article (available at https://doi.org/10.1287/ijds.2022.0010).
