Abstract

To address the high-dimensionality problem in credit risk assessment and improve classification accuracy, a high-dimensionality-trait-driven learning paradigm is proposed for feature extraction and classifier selection. The proposed paradigm consists of three main stages: categorization of high-dimensional data, high-dimensionality-trait-driven feature extraction, and high-dimensionality-trait-driven classifier selection. In the first stage, based on the definition of high dimensionality and the relationship between sample size and feature dimensions, the high-dimensionality traits of a credit dataset are categorized into two types: 100 < feature dimensions < sample size, and feature dimensions ≥ sample size. In the second stage, several typical feature extraction methods are tested on the two categories of high dimensionality. In the final stage, four types of classifiers are applied to evaluate credit risk under the different high-dimensionality traits. For illustration and verification, credit classification experiments are performed on two publicly available credit risk datasets. The results show that the proposed paradigm is effective in handling high-dimensional credit classification problems and improves credit classification accuracy relative to the benchmark models listed in this study.
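The first-stage categorization can be illustrated with a minimal Python sketch. The function name, the returned labels, and the "not-high-dimensional" fallback for datasets outside the two stated categories are assumptions made for illustration; only the two conditions themselves come from the abstract.

```python
def categorize_dimensionality(n_samples: int, n_features: int) -> str:
    """Assign a dataset to one of the two high-dimensionality traits.

    The two conditions follow the abstract; the label strings and the
    fallback label are hypothetical names chosen for this sketch.
    """
    if n_samples <= 0 or n_features <= 0:
        raise ValueError("n_samples and n_features must be positive")
    # Trait 2: feature dimensions >= sample size.
    if n_features >= n_samples:
        return "features>=samples"
    # Trait 1: 100 < feature dimensions < sample size.
    if n_features > 100:
        return "100<features<samples"
    # Assumption: anything else is not treated as high-dimensional here.
    return "not-high-dimensional"


if __name__ == "__main__":
    print(categorize_dimensionality(n_samples=1000, n_features=150))
    print(categorize_dimensionality(n_samples=50, n_features=200))
```

The returned label could then be used to look up the feature extraction method and classifier for that trait.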

Highlights

  • Credit risk classification has long been an active research topic, especially in the context of globalization (Ma and Wang 2020), where it has become increasingly important in the field of financial risk management

  • Methodology formulation: a high-dimensionality-trait-driven learning paradigm for feature extraction and classifier selection is proposed for the high-dimensional credit risk classification problem

  • When 100 < feature dimensions < sample size, principal component analysis (PCA) is selected as the feature extraction strategy and a single linear classifier is selected as the classification model, according to the high-dimensionality trait
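
The PCA-plus-linear-classifier choice for the case 100 < feature dimensions < sample size can be sketched with scikit-learn. This is an illustrative sketch, not the authors' implementation: the synthetic data, the number of retained components (20), and the use of logistic regression as the linear classifier are all assumptions, since the text above does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_pca_linear_pipeline(n_components: int = 20):
    """Standardize, project with PCA, then fit one linear classifier.

    Assumption: logistic regression stands in for the "single linear
    classifier"; the source does not name a specific model here.
    """
    return make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components),
        LogisticRegression(max_iter=1000),
    )


# Synthetic stand-in for a credit dataset with 100 < 150 features < 600 samples.
X, y = make_classification(
    n_samples=600, n_features=150, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_pca_linear_pipeline()
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

Wrapping the steps in one pipeline keeps the scaler and PCA fitted on the training split only, so the held-out score is not affected by information from the test samples.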


Summary

Introduction

Credit risk classification has long been an active research topic, especially in the context of globalization (Ma and Wang 2020), where it has become increasingly important in the field of financial risk management. The problem of sparse data and complicated calculation caused by too many features is known as the curse of dimensionality. This high-dimensionality problem matters in credit risk classification because it increases the cost and computation time of classification exponentially and reduces classification accuracy (Kou et al. 2020). Credit risk classification with high-dimensional features has therefore become a challenging task. Although many research achievements have been made in the past decades, many problems and challenges remain to be solved in this field.

