Abstract

Credit risk assessment is critical to banks' loan approval and risk management. However, missing credit risk data can greatly reduce the effectiveness of an assessment model, so an imputation method that accurately predicts missing values is highly beneficial. Building an effective imputation model is challenging because credit risk datasets typically have high missing rates and complex, arbitrary missing patterns. In this paper, we propose a novel imputation method named Multiple Generative Adversarial Imputation Networks (MGAIN). Specifically, we first randomly select multiple attribute subsets rather than using the full attribute set, so that more complete samples are available. Then, the missing data in each attribute are imputed using generative adversarial imputation networks (GAIN), which account for the relationships among missing values by combining neural networks and adversarial learning. The proposed subset selection and multiple imputation strategies not only simplify the network structure of GAIN but also reduce the amount of data required. Finally, a weighted average method is presented to combine the multiple imputed results for each missing attribute value, further improving accuracy. Experimental results on real-world data demonstrate that the proposed method outperforms other popular imputation methods.
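
The three steps described above (random attribute subsets, per-subset imputation, weighted averaging) can be illustrated with a minimal sketch. This is not the MGAIN implementation: the GAIN model is replaced by a simple placeholder imputer that fills gaps with column means taken over rows that are complete within the subset, and the weighting rule (share of fully observed rows in each subset) is an assumption, since the abstract does not specify the weights. The names `complete_row_mean_imputer` and `subset_ensemble_impute` and all parameter values are hypothetical.

```python
import numpy as np


def complete_row_mean_imputer(block):
    """Placeholder standing in for GAIN (assumption, not the paper's model).

    Fills NaNs with column means computed over rows that are complete
    within this attribute subset.
    """
    complete = ~np.isnan(block).any(axis=1)
    if complete.any():
        col_means = block[complete].mean(axis=0)
    else:
        col_means = np.nanmean(block, axis=0)  # fallback: no complete row
    filled = block.copy()
    rows, cols = np.where(np.isnan(block))
    filled[rows, cols] = col_means[cols]
    return filled


def subset_ensemble_impute(X, n_subsets=10, subset_size=2, seed=0,
                           imputer=complete_row_mean_imputer):
    """Impute on random attribute subsets, then take a weighted average."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n_cols = X.shape[1]
    weighted_sum = np.zeros_like(X)
    weight_total = np.zeros(n_cols)

    for _ in range(n_subsets):
        # Step 1: randomly select an attribute subset.
        cols = rng.choice(n_cols, size=subset_size, replace=False)
        block = X[:, cols]
        # Assumed weight: share of rows fully observed in this subset.
        weight = (~np.isnan(block).any(axis=1)).mean() + 1e-9
        # Step 2: impute within the subset (placeholder instead of GAIN).
        weighted_sum[:, cols] += weight * imputer(block)
        weight_total[cols] += weight

    # Columns never sampled fall back to one pass over all attributes.
    uncovered = weight_total == 0
    if uncovered.any():
        weighted_sum[:, uncovered] = imputer(X)[:, uncovered]
        weight_total[uncovered] = 1.0

    # Step 3: weighted average of the per-subset results.
    averaged = weighted_sum / weight_total
    # Observed entries are kept exactly; only missing entries are filled.
    return np.where(np.isnan(X), averaged, X)


X = np.array([[1.0, 2.0, np.nan],
              [2.0, np.nan, 6.0],
              [3.0, 6.0, 9.0],
              [np.nan, 8.0, 12.0]])
X_filled = subset_ensemble_impute(X, n_subsets=6, subset_size=2)
```

In this toy matrix only one of four rows is complete over all three attributes, whereas every pair of attributes has two complete rows, which illustrates why smaller subsets leave more complete samples to learn from.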

