Predictive Fraud Analysis Applying the Fraud Triangle Theory through Data Mining Techniques

José Estrada-Jiménez, Luis Urquiza-Aguiar, Marco Sánchez-Aguayo

doi:10.3390/app12073382

Open Access

Journal: Applied Sciences
Publication Date: Mar 26, 2022
Citations: 7
License type: CC BY 4.0

Affiliation: National Polytechnic School

Abstract

Fraud is increasingly common, and so are the losses it causes. There is thus a strong economic incentive to study this problem, particularly fraud prevention. One barrier to research in this direction is the lack of public data sets that contain fraudulent activities. In addition, although efforts have been made to detect fraud using machine learning, they have not considered the component of human behavior. In this work, we propose a mechanism to detect potential fraud by analyzing human behavior within a data set. This approach combines a predefined topic model and a supervised classifier to generate an alert from possibly fraud-related text. Potential fraud would then be detected based on a model built from such a classifier. As a result of this work, a synthetic fraud-related data set is created. When assessing different topic modeling techniques, four topics associated with the vertices of the fraud triangle theory are unveiled. After benchmarking topic modeling techniques and supervised and deep learning classifiers, we find that LDA, random forest, and CNN perform best in this scenario. The results suggest that our approach is feasible in practice, since several such models obtain an average AUC higher than 0.8. In short, the fraud triangle theory combined with topic modeling and linear classifiers could provide a promising framework for predictive fraud analysis.
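
The combination of a topic model and a supervised classifier described in the abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the toy messages and labels are invented for the example, and the preprocessing, hyperparameters, and the choice of four LDA topics feeding a random forest are assumptions loosely based on the abstract.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline

# Hypothetical toy messages, invented for illustration only
# (they are NOT taken from the paper's synthetic data set).
texts = [
    "I am under pressure to pay my debts this month",
    "nobody audits the petty cash account so access is easy",
    "I deserve this money after all the unpaid overtime",
    "the bank keeps calling about my overdue loan",
    "the team meeting is moved to Friday morning",
    "please review the attached quarterly report",
    "lunch menu for the cafeteria is now online",
    "the printer on the second floor is fixed",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = fraud-related, 0 = unrelated

# Bag-of-words counts -> per-document topic proportions -> classifier.
# n_components=4 mirrors the four topics mentioned in the abstract;
# every other setting here is an assumption.
pipeline = make_pipeline(
    CountVectorizer(stop_words="english"),
    LatentDirichletAllocation(n_components=4, random_state=0),
    RandomForestClassifier(n_estimators=50, random_state=0),
)
pipeline.fit(texts, labels)

# Probability of the fraud-related class, usable as an alert score.
scores = pipeline.predict_proba(texts)[:, 1]

# AUC on the training data is optimistic; a real evaluation would
# use held-out data or cross-validation.
train_auc = roc_auc_score(labels, scores)
print(f"training AUC (toy data): {train_auc:.2f}")
```

In this sketch the classifier never sees raw words, only the topic proportions produced by LDA, so each alert score can be traced back to a small number of topics.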

Full Text