Machine learning algorithms for fraud prediction in property insurance: Empirical evidence using real-world microdata

Matheus Kempa Severino, Yaohao Peng

doi:10.1016/j.mlwa.2021.100074

Open Access

https://doi.org/10.1016/j.mlwa.2021.100074

Journal: Machine Learning with Applications
Publication Date: Jun 22, 2021
Citations: 26
License type: CC BY-NC-ND

Affiliation: Universidade de Brasília

Abstract

This paper evaluated fraud prediction in property insurance claims using several machine learning models based on real-world data from a major Brazilian insurance company. The models were tested recursively, and their average predictive results were compared while controlling for false positives and false negatives. Ensemble-based methods (random forest and gradient boosting) and deep neural networks yielded the best results, with superior average performance relative to the other classifiers, including the commonly used logistic regression. In addition, we compiled a general profile of confirmed fraudsters from the dataset and used eXplainable Artificial Intelligence methods to estimate the impact of each feature on global classification performance and on prominent cases of false positive and false negative predictions. The findings of this study can help risk analysts and other professionals assess the strengths and weaknesses of each model and build empirically effective decision rules for evaluating future insurance policies.
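
To make the comparison described in the abstract concrete, the following minimal sketch shows one way to average false positive and false negative rates over repeated test runs. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not say which metrics were used, the function names are hypothetical, and the labels are toy values rather than data from the study.

```python
from statistics import mean


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 marks a fraudulent claim."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def error_rates(y_true, y_pred):
    """Return (false positive rate, false negative rate) for one test run."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # legitimate claims flagged as fraud
    fnr = fn / (fn + tp) if (fn + tp) else 0.0  # fraudulent claims that were missed
    return fpr, fnr


def average_error_rates(runs):
    """Average both error rates over a sequence of (y_true, y_pred) runs."""
    rates = [error_rates(y_true, y_pred) for y_true, y_pred in runs]
    return mean(r[0] for r in rates), mean(r[1] for r in rates)


# Toy labels for two hypothetical test runs of a single classifier (assumed data).
runs = [
    ([1, 0, 0, 1, 0], [1, 0, 1, 0, 0]),
    ([0, 0, 1, 1, 0], [0, 0, 1, 1, 1]),
]
avg_fpr, avg_fnr = average_error_rates(runs)
print(f"average FPR = {avg_fpr:.3f}, average FNR = {avg_fnr:.3f}")
```

Reporting the two rates separately, rather than a single accuracy figure, matters in fraud detection because fraudulent claims are typically rare and the costs of the two error types usually differ.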

Full Text