Abstract

Credit scoring datasets often exhibit significant class imbalance, particularly in portfolios of secured loans such as mortgages. Class imbalance occurs when non-default cases far outnumber default cases. A naive classifier can achieve high accuracy by assigning all cases to the majority class; however, misclassifying the minority class is often costly. We propose using the quantile function of the generalized extreme value (GEV) distribution as the link function in XGBoost, a well-known and robust classification method, to enhance the detection of rare cases. To complement the GEV link function, we apply a modified focal loss function in XGBoost that jointly penalizes misclassification of the class of interest and focuses on hard-to-classify cases. We test our proposal on a large database of mortgage loans with rare default cases, available on the Freddie Mac website. As benchmarks, we also consider other common large credit scoring databases, existing extensions of XGBoost for handling class imbalance, and other state-of-the-art techniques for learning from class-imbalanced data. According to the results, the proposed model has superior predictive power to competing models when the class imbalance arises because default events are outliers or rare in the dataset. We also demonstrate that the results are likely to hold up in real-world situations and add business value under certain portfolio characteristics.
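
As a rough illustration of the two ingredients named above, the following minimal sketch pairs a GEV link with a focal loss in plain Python. It is not the paper's implementation: the standard focal loss stands in for the modified version, whose exact form the abstract does not give, and the function names, the shape parameter `xi = -0.25`, and the weights `alpha` and `gamma` are illustrative assumptions. The gradient and Hessian with respect to the raw score, which a custom XGBoost objective would need to return, are approximated here by finite differences rather than derived analytically.

```python
import math

def gev_link(p, xi=-0.25):
    """GEV quantile function: maps a probability p in (0, 1) to a raw score z.
    The value of xi is an arbitrary illustrative choice."""
    if xi == 0.0:
        return -math.log(-math.log(p))  # Gumbel limit
    return ((-math.log(p)) ** (-xi) - 1.0) / xi

def gev_inverse_link(z, xi=-0.25):
    """GEV cdf: maps a raw score z back to a probability."""
    if xi == 0.0:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: the cdf is 0 when xi > 0 and 1 when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def focal_loss(y, p, alpha=0.75, gamma=2.0, eps=1e-12):
    """Standard focal loss (assumed stand-in for the modified loss).
    alpha up-weights the class of interest (y = 1); gamma down-weights
    cases that are already classified confidently."""
    p = min(max(p, eps), 1.0 - eps)  # keep the logarithms finite
    if y == 1:
        return -alpha * (1.0 - p) ** gamma * math.log(p)
    return -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)

def grad_hess(y, z, xi=-0.25, h=1e-4):
    """Central finite-difference gradient and Hessian of the loss w.r.t. z."""
    def loss(s):
        return focal_loss(y, gev_inverse_link(s, xi))
    l0, lp, lm = loss(z), loss(z + h), loss(z - h)
    return (lp - lm) / (2.0 * h), (lp - 2.0 * l0 + lm) / (h * h)
```

Unlike the logit link, the GEV link is asymmetric around a probability of 0.5, which is the property that makes it a candidate for rare-event data. Note that the focal loss is not convex in the raw score, so the Hessian from this sketch can be negative and would likely need clipping before being passed to a boosting routine.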
