Financial fraud takes many forms and often involves intricate networks of financial transactions, which makes it difficult to identify the perpetrators and to characterize the fraud. In recent years, machine learning has been widely applied in the financial sector, and a variety of financial fraud detection models have been developed using different machine learning methods. In the methods section, this paper gives an overview of the machine learning process and then discusses how machine learning models are applied in several financial fraud scenarios. Insurance fraud is further divided into automobile and health insurance fraud. In automobile insurance fraud detection, some studies apply an implicit Naive Bayes model to analyze observable features and estimate hidden variables, while others use resampling to address data imbalance and apply seven machine learning models for analysis. In health insurance fraud detection, many studies train multiple models on medical data, including AdaBoost, Logistic Regression, and Support Vector Machine. In credit card fraud detection, some studies use five algorithms, including Random Forest and Decision Tree, and others construct a model that combines Decision Tree (DT) and Logistic Regression (LR). In bank fraud detection, some studies combine Value-at-Risk with machine learning algorithms, and others propose a decentralized model training method based on federated learning. In the discussion section, this paper addresses current limitations of the research, such as poor interpretability, uneven data distribution, and customer privacy concerns, and proposes corresponding solutions.