Abstract

Credit risk has been a widespread and deeply rooted problem for centuries. However, only since the development of various credit derivatives and products, and since novel technologies began radically changing human society, have fraud detection, credit scoring and other risk management systems become so important, not only to specific firms but to industries and governments worldwide. Fraud and unpredictable defaults cost billions of dollars each year, forcing financial institutions to continuously improve their systems to reduce losses. Over the past twenty years, a large number of studies have proposed data mining techniques to detect fraud, score credit and manage risk. However, issues such as data selection, algorithm design and hyperparameter optimization affect the perceived performance of the proposed solutions, making it difficult for auditors and researchers to identify the state of the art in this area. This survey reviews recently developed data mining techniques for fraud detection and credit scoring. Several notable experiments are recorded and highlighted, and the corresponding techniques, which are mostly based on supervised learning, unsupervised learning, semi-supervised learning, ensemble learning, transfer learning or hybrids of these, are explained and analysed. The goal of this paper is to provide a dense review of up-to-date techniques for fraud detection and credit scoring, a general analysis of the results achieved, and the upcoming challenges for further research.

Highlights

  • Fraud and unpredictable defaults cost billions of dollars each year, forcing financial institutions to continuously improve their systems to reduce losses. Consequently, fraud detection and credit scoring have become active research areas, and over the past twenty years a large number of studies have proposed novel data mining techniques for fraud detection, credit scoring and risk management

  • This paper reviews the literature on fraud detection and credit scoring approaches based on supervised, unsupervised, semi-supervised, ensemble and transfer learning techniques

  • We observe that most fraud detection systems employ at least one supervised learning method


Summary

Introduction

Credit risk has been a widespread and deeply rooted problem for centuries. However, only since the development of various credit derivatives and products, and since novel technologies began radically changing human society, have fraud detection, credit scoring and other risk management systems become so important, not only to specific firms but to industries and governments worldwide. The explosive growth of China’s credit market provides opportunities for related organizations to make a profit, for customers to get brand new services and products, for fraudsters to seek unauthorized benefits, and for researchers to design intelligent fraud detection and credit scoring systems. Issues such as data selection, algorithm design and hyperparameter optimization affect the perceived performance of the proposed solutions, making it difficult for auditors and researchers to identify the state of the art in this area.

Related works
Classification of data mining techniques and applications
Decision tree
Logistic regression
Support vector machine
Artificial neural network
K-means
Graph based semi-supervised learning
Automobile insurance fraud detection
Financial statement fraud detection
Credit card fraud detection
P2P lending fraud detection and credit scoring
Findings
Conclusion
