Abstract

The development and application of computational data mining techniques for financial fraud detection and business failure prediction has recently become a popular cross-disciplinary research area involving financial economists, forensic accountants and computational modellers. Some of the computational techniques commonly used for financial fraud detection and business failure prediction can also be applied effectively to the detection of fraudulent insurance claims, and can therefore be of considerable practical value to the insurance industry. We provide a comparative analysis of the prediction performance of a battery of data mining techniques using real-life automotive insurance fraud data. While the data used in our paper are US-based, the computational techniques we have tested can be adapted and applied to detect similar insurance fraud in other countries where an organized automotive insurance industry exists.

Highlights

  • The annual cost of settlements from fraudulent insurance claims in Australia was estimated at $1.4 billion in 1997, which added $70 to the annual premium of each insurance policy (Baldock, 1997)

  • Logit analysis (LA) had slightly superior classification accuracy, but all four models performed comparably, with close to 70% overall accuracy. These results support further testing of Cox and See5 models as automotive insurance fraud classifiers, as decision trees (DTs) and survival analysis (SA) models are known to be sensitive to their training datasets

  • Analysis on a much larger dataset is desirable. This would allow for hold-out sample tests as indicated by Wilson (2009). These hold-out sample tests should have more realistic proportions of fraudulent/legitimate claims rather than the synthetic even split in the data used here
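
The hold-out idea in the last highlight can be illustrated with a minimal sketch. It assumes claims are represented only by 0/1 fraud flags, and the function name, the sample sizes and the 10% fraud rate are all hypothetical values chosen for illustration; none of them comes from the paper. The sketch draws a hold-out sample whose fraud share is set by the caller rather than inherited from an evenly split dataset.

```python
import random


def split_with_realistic_holdout(labels, holdout_size, fraud_rate, seed=0):
    """Return (train_idx, holdout_idx) with a chosen fraud share in the hold-out.

    labels: list of 0/1 flags, where 1 marks a fraudulent claim.
    fraud_rate: desired proportion of fraudulent claims in the hold-out
                (an assumption supplied by the caller, not estimated here).
    """
    rng = random.Random(seed)
    fraud = [i for i, y in enumerate(labels) if y == 1]
    legit = [i for i, y in enumerate(labels) if y == 0]

    n_fraud = round(holdout_size * fraud_rate)
    n_legit = holdout_size - n_fraud
    if n_fraud > len(fraud) or n_legit > len(legit):
        raise ValueError("not enough claims of one class for this hold-out")

    # Sample each class separately so the hold-out hits the target share.
    holdout = set(rng.sample(fraud, n_fraud) + rng.sample(legit, n_legit))
    train = [i for i in range(len(labels)) if i not in holdout]
    return train, sorted(holdout)


# Hypothetical data: 1,000 claims with an even fraud/legitimate split.
labels = [1] * 500 + [0] * 500
train_idx, holdout_idx = split_with_realistic_holdout(
    labels, holdout_size=200, fraud_rate=0.10
)
holdout_fraud_share = sum(labels[i] for i in holdout_idx) / len(holdout_idx)
```

Sampling the two classes separately keeps the hold-out proportion under direct control. Note that the remaining training indices are then no longer evenly split, which may itself need handling before a model is fitted.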

Summary

Introduction

The annual cost of settlements from fraudulent insurance claims in Australia was estimated at $1.4 billion in 1997, which added $70 to the annual premium of each insurance policy (Baldock, 1997). These figures are likely to be much higher today, as fraud is a growing problem (Morley et al., 2006). Automated statistical techniques for insurance fraud are designed to assist the detection of fraudulent claims in a time-efficient manner. If successful, this would reduce the costs of fraud outlined above. The paper closes with concluding remarks on the problem and on the methods.

Some Issues for Statistical Models in the Field
Introduction to the Research Field
Survival Analysis
Decision Trees
Hybrid Models
Methodology
Findings
Concluding Discussion