Effects of missing data in credit risk scoring. A comparative analysis of methods to achieve robustness in the absence of sufficient data

R Florez-Lopez

doi:10.1057/jors.2009.66

Abstract

The 2004 Basel II Accord highlighted the benefits of managing credit risk through internal models that use internal data to estimate the risk components: probability of default (PD), loss given default, exposure at default and maturity. Internal data are the primary data source for PD estimates; banks are permitted to use statistical default prediction models to estimate borrowers' PD, subject to requirements concerning the accuracy, completeness and appropriateness of the data. In practice, however, internal records are usually incomplete or do not contain adequate history to estimate the PD. Missing data are particularly critical for low default portfolios, which are characterised by inadequate default records, making it difficult to design statistically significant prediction models. Several methods can be used to deal with missing data, such as list-wise deletion, application-specific list-wise deletion, substitution techniques or imputation models (simple and multiple variants). List-wise deletion is an easy-to-use method widely applied by social scientists, but it loses substantial data and reduces the diversity of information, resulting in bias in the model's parameters, results and inferences. The choice of the best method for the missing data problem largely depends on the nature of the missing values (missing completely at random, MCAR; missing at random, MAR; missing not at random, MNAR), but there is a lack of empirical analysis of their effect on credit risk, which limits the validity of the resulting models. In this paper, we analyse the nature and effects of missing data in credit risk modelling (MCAR, MAR and MNAR processes) and consider a current, scarce data set on consumer borrowers, which includes different percentages and distributions of missing data. The findings are used to analyse the performance of several methods for dealing with missing data, such as list-wise deletion, simple imputation methods, MLE models and advanced multiple imputation (MI) alternatives based on Markov Chain Monte Carlo and re-sampling methods. Results are evaluated and discussed across models in terms of robustness, accuracy and complexity. In particular, MI models are found to provide very valuable solutions for missing data in credit risk.
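The abstract's point that list-wise deletion loses substantial data, and that simple imputation is only one alternative among several, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic Gaussian values rather than the paper's consumer-borrower data set, missingness is generated as independent MCAR, and the helper names (`make_mcar`, `listwise_delete`, `mean_impute`) are hypothetical. It is not the author's procedure and does not cover the MLE or MI models the paper evaluates.

```python
import random
import statistics


def make_mcar(rows, p_missing, rng):
    """Blank each cell independently with probability p_missing (an MCAR process)."""
    return [[None if rng.random() < p_missing else v for v in row] for row in rows]


def listwise_delete(rows):
    """Keep only the rows in which every variable is observed."""
    return [row for row in rows if all(v is not None for v in row)]


def mean_impute(rows):
    """Replace each missing cell with the mean of the observed values in its column.

    Assumes every column has at least one observed value.
    """
    n_cols = len(rows[0])
    means = [
        statistics.fmean(row[j] for row in rows if row[j] is not None)
        for j in range(n_cols)
    ]
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in rows]


# Hypothetical synthetic sample: 2000 "borrowers", 5 numeric variables, 10% missing per cell.
rng = random.Random(0)
n_rows, n_cols, p_missing = 2000, 5, 0.10
complete = [[rng.gauss(0.0, 1.0) for _ in range(n_cols)] for _ in range(n_rows)]
observed = make_mcar(complete, p_missing, rng)

kept = listwise_delete(observed)
imputed = mean_impute(observed)

# Under independent MCAR the expected share of fully observed rows is (1 - p)^k.
retained_share = len(kept) / n_rows
expected_share = (1 - p_missing) ** n_cols

# Mean imputation keeps every row but shrinks the variance of the filled column.
col0_observed = [row[0] for row in observed if row[0] is not None]
col0_imputed = [row[0] for row in imputed]
var_observed = statistics.pvariance(col0_observed)
var_imputed = statistics.pvariance(col0_imputed)

print(f"rows kept by list-wise deletion: {len(kept)} of {n_rows}")
print(f"retained share {retained_share:.3f} vs expected {expected_share:.3f}")
print(f"column 0 variance, observed {var_observed:.3f} vs mean-imputed {var_imputed:.3f}")
```

With only 10% of cells missing per variable, list-wise deletion is expected to discard roughly 41% of the rows in this setup, since (1 - 0.10)^5 ≈ 0.59. Mean imputation retains all rows but understates variability, which is one common motivation for multiple imputation.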

Journal: Journal of the Operational Research Society
Publication Date: Mar 1, 2010
Citations: 35
