Identification of high-risk beneficiaries in private healthcare insurance.

Adauto Santos, Gislaine Camila Lapasini Leal, Renato Balancieri

doi:10.1177/14604582241230384

Abstract

The objective of this study was to apply the Knowledge Discovery in Databases process to predict whether beneficiaries of a private healthcare insurance plan would belong, at least once, to the 'very high cost' and 'complex cases' groups during the 12 months following the month in which the algorithms were applied. Datasets were built containing information on beneficiaries' effective use of their health plan, as well as their characteristics. Five machine learning algorithms were used: Random Forest, Extra Tree, XGBoost, Naive Bayes and K-Nearest Neighbor. The K-Nearest Neighbor algorithm achieved a recall of 81.12%, a precision of 83.77% and an Area Under the Curve (AUC) of 0.9045. The study also revealed that categorization occurs, on average, 8.11 months before a beneficiary enters a high-risk group for the first time, considering the dataset classification from January 2019 to June 2020.
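
For readers less familiar with the three metrics reported in the abstract, the following is a minimal sketch of how recall, precision and AUC can be computed for a binary "high-risk" classifier. It is not the authors' code: the function names `precision_recall` and `roc_auc`, the 0.5 decision threshold, and the eight toy labels and scores are all assumptions made for illustration and have no connection to the study's data.

```python
# Illustrative only: toy labels and scores, NOT data from the study.

def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 = high-risk."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = beneficiary entered a high-risk group.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]  # assumed model scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

precision, recall = precision_recall(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"precision={precision:.4f} recall={recall:.4f} auc={auc:.4f}")
```

In this setting, recall is the share of beneficiaries who actually entered a high-risk group that the model flagged in advance, precision is the share of flagged beneficiaries who did enter one, and AUC summarizes how well the scores rank high-risk above non-high-risk beneficiaries across all thresholds.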

Full Text