Abstract

Default prediction estimates the probability that a firm will default by fitting a prediction model that captures the functional relation between feature data at time t-m and default status at time t. If a defaulting company is misclassified as non-defaulting, banks are misled into lending to a “defaulter” and may suffer huge losses; if a non-defaulting company is misclassified as defaulting, lenders risk losing high-quality customers. To support the lending decisions of banks and non-banking financial institutions, this study proposes a two-stage default prediction model that integrates k-means clustering, which partitions the sample, with support vector domain description (SVDD), which predicts default (credit scoring). The model is trained on feature data at time t-m (m = 1, 2, 3, 4, 5) and default status at time t, so that it can warn of default m years ahead. The results show that the proposed two-stage model predicts more accurately than single-stage models using only k-means clustering or only SVDD, and that it can predict default up to five years ahead (AUC > 0.85). The study further suggests that “retained earnings/total assets”, “financial expenses/gross revenue”, and “type of audit opinion” are three key features in default forecasting for Chinese listed enterprises. It contributes to multi-stage credit scoring research by demonstrating that combining different methods is worth considering to improve the performance of default prediction models.
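
The pairing of feature data at time t-m with default status at time t can be illustrated with a minimal sketch. It is not the authors' implementation: the firm names, years, feature values, and the helper `lagged_pairs` are all hypothetical and serve only to show how a training set for an m-year-ahead warning might be assembled.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, int]  # (firm identifier, year)

# Hypothetical toy panel: feature vector per firm-year (values are made up).
features: Dict[Key, List[float]] = {
    ("A", 2015): [0.12, 0.03],
    ("A", 2016): [0.10, 0.04],
    ("A", 2017): [0.05, 0.07],
    ("B", 2015): [0.30, 0.01],
    ("B", 2016): [0.31, 0.01],
    ("B", 2017): [0.29, 0.02],
}

# Hypothetical default status per firm-year: 1 = default, 0 = non-default.
status: Dict[Key, int] = {
    ("A", 2016): 0,
    ("A", 2017): 1,
    ("B", 2016): 0,
    ("B", 2017): 0,
}


def lagged_pairs(
    features: Dict[Key, List[float]], status: Dict[Key, int], m: int
) -> List[Tuple[List[float], int]]:
    """Pair features observed at t-m with the default status observed at t."""
    pairs = []
    for (firm, t), y in sorted(status.items()):
        x = features.get((firm, t - m))
        if x is not None:  # skip firm-years with no observation m years back
            pairs.append((x, y))
    return pairs


pairs_m1 = lagged_pairs(features, status, m=1)  # one-year-ahead training set
pairs_m2 = lagged_pairs(features, status, m=2)  # two-year-ahead training set
```

A separate training set built this way for each m = 1, ..., 5 would then be passed to whatever classifier is being evaluated. A longer lag leaves fewer usable firm-years, because the earliest observations have no feature data m years back.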
