Abstract

Customer credit scoring is a dynamic, interactive process. A static reward function for deep reinforcement learning can struggle to guide an agent as the credit scoring environment changes. To address this problem, we propose a deep Q-network with a confusion-matrix-based dynamic reward function (DQN-CMDRF). In particular, the newly constructed dynamic reward function adjusts the reward according to the change in the confusion matrix after each round of deep Q-network training, which guides the agent to adapt quickly to environmental change and thereby improves the credit scoring performance of the deep Q-network. First, we formulate customer credit scoring as a finite Markov decision process. Second, we design the dynamic reward function based on the confusion matrix, so that the reward adapts to the credit scoring environment. Finally, we incorporate the confusion-matrix-based dynamic reward function into the deep Q-network model for customer credit scoring. To verify the effectiveness of the proposed model, we adopt four evaluation measures and conduct a series of experiments on five customer credit scoring datasets. The experimental results show that the constructed dynamic reward function effectively improves the credit scoring performance of the deep Q-network, and that DQN-CMDRF significantly outperforms eight traditional classification models. More importantly, we find that the dynamic reward function accelerates convergence and improves the stability of the deep Q-network model.
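The abstract does not give the exact form of the confusion-matrix-based reward, so the following is only an illustrative sketch of the idea: after each training round, per-class rewards are recomputed from the previous round's confusion matrix, so that the class the agent currently misclassifies more often earns a larger reward for a correct decision in the next round. The function name, the reward scale, and the recall-based update rule are all assumptions, not the paper's formula.

```python
import numpy as np

def dynamic_rewards(confusion_matrix):
    """Illustrative dynamic reward update from a 2x2 confusion matrix.

    confusion_matrix: array [[TN, FP], [FN, TP]] from the last
    evaluation of the agent's policy on the credit scoring data.
    Returns (reward_good, reward_bad): rewards for correctly
    classifying a good (negative) vs. bad (positive) customer.
    """
    tn, fp = confusion_matrix[0]
    fn, tp = confusion_matrix[1]
    # Per-class recall; max(..., 1) guards against an empty class.
    recall_good = tn / max(tn + fp, 1)
    recall_bad = tp / max(tp + fn, 1)
    # The worse the agent currently does on a class, the larger the
    # reward for getting that class right in the next training round.
    reward_good = 1.0 + (1.0 - recall_good)
    reward_bad = 1.0 + (1.0 - recall_bad)
    return reward_good, reward_bad

# Example: bad customers are recalled poorly (10 of 20), so their
# reward is raised more than that of the well-recalled good class.
r_good, r_bad = dynamic_rewards(np.array([[80, 20], [10, 10]]))
```

Because the rewards are recomputed between rounds rather than fixed in advance, the reward signal tracks the agent's current weaknesses, which is the mechanism the abstract credits for faster convergence.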
