Abstract Previous studies on the peer-to-peer (P2P) lending market have shown that borrowers' default status is related to textual factors derived from the loan descriptions in loan applications. However, the utility of textual loan descriptions in credit risk evaluation models has not been fully explored. In this study, we propose a new approach to constructing a credit risk assessment model for the P2P lending market. This approach first uses a Transformer encoder to extract textual features from the loan description and then combines them with the hard features derived from the loan application; together, they constitute the final features of a loan. Finally, the combined features are fed into a two-layer feed-forward neural network to predict the loan's default probability. We perform empirical studies on two data sets of real transactions: LendingClub loan data from the American market and Renrendai loan data from the Chinese market. The results show that the model that considers the textual loan description outperforms the model that does not in terms of loan default prediction. Furthermore, the model based on the Transformer encoder achieves the best performance under the AUC and G-mean metrics.
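The pipeline described above (textual features concatenated with hard features, then passed through a two-layer feed-forward network) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the Transformer encoder is not reproduced, and `text_vec` is only a random placeholder for its output. The function names, the dimensions, the ReLU activation, the sigmoid output, and the random weights are all assumptions made for illustration. The `g_mean` helper assumes the common definition of G-mean as the geometric mean of sensitivity and specificity; the abstract itself does not state the formula.

```python
import numpy as np


def sigmoid(x):
    """Logistic function mapping logits to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def predict_default(text_vec, hard_vec, params):
    """Fuse text and hard features, then apply a two-layer feed-forward net.

    text_vec: (batch, text_dim) placeholder for Transformer-encoder output.
    hard_vec: (batch, hard_dim) structured features from the application.
    params:   hypothetical dict of weights W1, b1, W2, b2.
    """
    # Concatenation yields the "final features of a loan".
    fused = np.concatenate([text_vec, hard_vec], axis=-1)
    # Layer 1: affine map followed by ReLU (the activation is an assumption).
    hidden = np.maximum(0.0, fused @ params["W1"] + params["b1"])
    # Layer 2: affine map to a single logit, squashed to a probability.
    logit = hidden @ params["W2"] + params["b2"]
    return sigmoid(logit).squeeze(-1)


def g_mean(y_true, y_pred):
    """G-mean, assumed here to be sqrt(sensitivity * specificity).

    Expects binary labels with at least one positive and one negative in
    y_true; otherwise a rate is undefined (division by zero).
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    sensitivity = tp / (tp + fn)  # true positive rate on defaulted loans
    specificity = tn / (tn + fp)  # true negative rate on repaid loans
    return float(np.sqrt(sensitivity * specificity))


# Illustrative sizes only; none of these values come from the paper.
rng = np.random.default_rng(0)
text_dim, hard_dim, hidden_dim, batch = 8, 4, 16, 5
params = {
    "W1": rng.normal(scale=0.1, size=(text_dim + hard_dim, hidden_dim)),
    "b1": np.zeros(hidden_dim),
    "W2": rng.normal(scale=0.1, size=(hidden_dim, 1)),
    "b2": np.zeros(1),
}
text_vec = rng.normal(size=(batch, text_dim))  # stand-in for encoder output
hard_vec = rng.normal(size=(batch, hard_dim))  # stand-in for hard features

probs = predict_default(text_vec, hard_vec, params)
labels = (probs >= 0.5).astype(int)  # 0.5 threshold is an assumption
```

Because default data sets are typically imbalanced, a metric such as G-mean penalizes a classifier that scores well on the majority class while missing most defaults, which plain accuracy would not reveal.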