Abstract

Insolvency prediction and credit rating are challenging tasks used to evaluate the creditworthiness of commercial enterprises based on qualitative and quantitative attributes. One way to approach these tasks is machine learning, whereby prediction models are built from sample data. The advantage of machine learning is that it automates the process, obviating the need for human knowledge in most cases and thus offering a high level of objectivity. Nevertheless, this approach is not perfect, which is why it does not completely replace human expertise. Since these models can serve as decision support for experts, interpretable models are desirable. Unfortunately, only a few machine learning algorithms produce interpretable models. Furthermore, some tasks in finance, such as credit rating, are multiclass problems. Multiclass classification is often achieved via meta-algorithms that combine multiple binary learners. However, most state-of-the-art meta-algorithms destroy the interpretability of the binary models.

In this thesis, we study the performance of interpretable models compared to non-interpretable models in insolvency prediction and credit rating. As interpretable models, we consider disjunctive normal forms and decision trees built from thresholds on financial ratios; as non-interpretable models, we use random forests, artificial neural networks, and support vector machines. Furthermore, we use our own machine learning algorithm, Thresholder, to build disjunctive normal forms and interpretable multiclass models. For the task of insolvency prediction, we demonstrate that interpretable models are not inferior to non-interpretable black-box models. In a first case study, a real-life database with financial statements of 5152 enterprises is used to evaluate the performance of all models. In a second case study, focused on credit rating, we show that interpretable multiclass models are even superior to non-interpretable multiclass models. We evaluate their performance on three real-life data sets, each divided into three rating classes. In these case studies, we compare the different interpretable approaches with respect to model size and type of interpretability, and we provide example models built on these real-life databases together with an interpretation of them.

The results show that interpretable threshold-based models are appropriate for classification problems in finance; for these tasks, they are not inferior to more sophisticated models such as support vector machines. Our algorithm Thresholder builds the smallest models while performing comparably to the other interpretable models. In our case study on credit rating, interpretable models perform better than in our case study on insolvency prediction. A possible explanation lies in the nature of credit rating: in contrast to insolvencies, credit ratings are man-made. They are based on decisions by people who think in interpretable rules, e.g., logical operations on thresholds. Thus, we assume that interpretable models match these problems and can detect and represent the underlying rules.
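
A disjunctive normal form of thresholds can be read directly as a set of if-then rules. The following minimal Python sketch illustrates the general shape of such a model; the financial ratios, threshold values, and clauses are hypothetical examples chosen for illustration, not models learned in the thesis.

    # Illustrative sketch of a threshold-based DNF model for insolvency
    # prediction. Ratios, thresholds, and rule structure are hypothetical.

    def predict_insolvent(ratios):
        """Classify an enterprise as insolvent (True) or solvent (False).

        `ratios` maps financial-ratio names to values. The model is a
        disjunction (OR) of conjunctions (AND) of threshold conditions
        on single ratios, i.e., a disjunctive normal form.
        """
        clause_1 = ratios["equity_ratio"] < 0.10 and ratios["quick_ratio"] < 0.8
        clause_2 = ratios["return_on_assets"] < -0.05
        return clause_1 or clause_2

    # Example usage with hypothetical figures:
    firm = {"equity_ratio": 0.07, "return_on_assets": 0.01, "quick_ratio": 0.6}
    print(predict_insolvent(firm))  # True: the first clause fires

Each clause is an easily verifiable rule over a handful of thresholds, which is what makes such models useful as decision support for experts.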
