Abstract

In recent years, online lending platforms have been becoming attractive for micro-financing and popular in financial industries. However, such online lending platforms face a high risk of failure due to the lack of expertise on borrowers' creditworthness. Thus, risk forecasting is important to avoid economic loss. Detecting loan fraud users in advance is at the heart of risk forecasting. The purpose of fraud user (borrower) detection is to predict whether one user will fail to make required payments in the future. Detecting fraud users depend on historical loan records. However, a large proportion of users lack such information, especially for new users. In this paper, we attempt to detect loan fraud users from cross domain heterogeneous data views, including user attributes, installed app lists, app installation behaviors, and app-in logs, which compensate for the lack of historical loan records. However, it is difficult to effectively fuse the multiple heterogeneous data views. Moreover, some samples miss one or even more data views, increasing the difficulty in fusion. To address the challenges, we propose a novel end-to-end deep multiview learning approach, which encodes heterogeneous data views into homogeneous ones, generates the missing views based on the learned relationship among all the views, and then fuses all the views together to a comprehensive view for identifying fraud users. Our model is evaluated on a real-world large-scale dataset consisting of 401,978 loan records of 228,117 users from January 1, 2019, to September 30, 2019, achieving the state-of-the-art performance.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call