Abstract
Predictive models of health care costs have become mainstream in much of health care actuarial work. The Affordable Care Act requires the use of predictive modeling-based risk-adjuster models to transfer revenue between different health exchange participants. Although the predictive accuracy of these models has been investigated in a number of studies, the accuracy and use of models for applications other than risk adjustment have received little attention. We investigate predictive modeling of future health care costs using several statistical techniques. Our analysis is based on a dataset of 30,000 insureds containing claims information from two contiguous years. The dataset contains more than 100 covariates for each insured, including a detailed breakdown of past costs and causes encoded via coexisting condition flags. We discuss statistical models that relate next-year costs to medical and cost information in order to predict the mean and quantiles of future cost, rank risks, and identify the most predictive covariates. A comparison of multiple models is presented, including (in addition to the traditional linear regression model underlying risk adjusters) Lasso GLM, multivariate adaptive regression splines, random forests, decision trees, and boosted trees. A detailed performance analysis shows that the traditional regression approach does not perform well and that more accurate models are possible.
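The abstract does not say which accuracy measures the performance analysis uses. As a minimal sketch, assuming R-squared for mean predictions and the pinball (quantile) loss for quantile predictions, competing cost models could be scored as follows. The function names and all numbers are hypothetical illustrations, not taken from the study's data.

```python
from statistics import mean


def r_squared(actual, predicted):
    """Share of the variance in actual costs explained by the predictions."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def pinball_loss(actual, predicted, tau):
    """Average pinball loss of quantile predictions at level tau in (0, 1)."""
    return mean(
        tau * (a - p) if a >= p else (1.0 - tau) * (p - a)
        for a, p in zip(actual, predicted)
    )


# Made-up next-year costs for four insureds (illustration only).
actual = [0.0, 200.0, 1000.0, 4000.0]

# Two hypothetical sets of mean predictions to compare.
model_a = [100.0, 300.0, 900.0, 3500.0]
model_b = [1300.0, 1300.0, 1300.0, 1300.0]  # the sample mean for everyone

# A hypothetical 90th-percentile prediction, identical for every insured.
q90_pred = [500.0, 500.0, 500.0, 500.0]

print("R^2, model A:", r_squared(actual, model_a))
print("R^2, model B:", r_squared(actual, model_b))
print("Pinball loss at tau=0.9:", pinball_loss(actual, q90_pred, 0.9))
```

A higher R-squared means better mean predictions, and a lower pinball loss means better quantile predictions. In practice both would be computed on held-out data, for example the second of the two contiguous years.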