Abstract
Financial fraud detection is one of the core technological assets of Fintech companies. It saves Chinese Fintech companies tens of millions in losses, since the bad loan rate exceeds 10%. HC Financial Service Group is the third-largest company in the Chinese P2P financial market. In this paper we illustrate how we tackle the fraud detection problem at HC Financial. We use two powerful workhorses of the machine learning field, random forest and gradient boosting decision tree, to detect fraudulent users. We demonstrate that by carefully selecting features and tuning model parameters, we can effectively filter out fraudulent users in the P2P market.
Highlights
Fintech is one of the most thriving industries in many countries around the world
People rely on Fintech to lend and borrow money, detect fraudulent users, and match loans between lenders and borrowers
The knowledge graph is one of the core assets of P2P companies because it is highly useful in crucial business processes within the company
Summary
Fintech is one of the most thriving industries in many countries around the world. People rely on Fintech to lend and borrow money, detect fraudulent users, and match loans between lenders and borrowers. Major P2P companies in China use a technology called the knowledge graph to facilitate their financial processes. Each node in a knowledge graph represents an entity such as a person, an ID card number, or an address. We have a dedicated team of nearly 100 staff working on credit risk modeling problems, drawing on the information of more than 400 million users. Manual, case-by-case inspection of an individual node and its neighbors in an extremely large knowledge graph is part of the credit risk modeling team's daily routine. One advantage of the problem setting in the P2P financial market is that the fraud rate is very high: more than 10%, and sometimes much higher. The high fraud rate is a big headache for those who run the company, but it saves the day for algorithm engineers, since the class imbalance problem is far less severe. We demonstrate that with random forest and gradient boosting decision tree, we can obtain evaluation metrics comparable to those of problems without class imbalance.
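The inspection of a node and its neighbors described above can be sketched in a few lines of Python. This is a minimal illustration only, not the system used at HC Financial: the toy edges, the node naming scheme (`person:`, `id_card:`, `address:`) and the helper functions `build_adjacency` and `neighborhood` are all hypothetical, and a production knowledge graph would live in a graph database rather than in an in-memory dictionary.

```python
from collections import deque

# Hypothetical toy graph: nodes are entities (person, ID card number, address);
# an edge links two entities that appear together in the same record.
EDGES = [
    ("person:A", "id_card:111"),
    ("person:A", "address:X"),
    ("person:B", "address:X"),
    ("person:B", "id_card:222"),
    ("person:C", "id_card:333"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = {}
    for left, right in edges:
        adjacency.setdefault(left, set()).add(right)
        adjacency.setdefault(right, set()).add(left)
    return adjacency


def neighborhood(adjacency, start, max_hops):
    """Breadth-first search: return {node: hop distance} within max_hops of start."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


adjacency = build_adjacency(EDGES)
two_hop = neighborhood(adjacency, "person:A", max_hops=2)

# Other people reachable within two hops share some entity with person:A.
shared = sorted(
    node for node in two_hop
    if node.startswith("person:") and node != "person:A"
)
print(shared)
```

In this toy graph, `person:B` appears in the two-hop neighborhood of `person:A` because both are linked to `address:X`. Shared entities of this kind are the sort of connection an analyst might look at when inspecting a node by hand.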