Credit Risk Modeling with Graph Machine Learning

Sanjiv Das, Xin Huang, Soji Adeshina, Patrick Yang, Leonardo Bachega

doi:10.1287/ijds.2022.00018

Abstract

Accurate credit ratings are an essential ingredient in the decision-making process for investors, rating agencies, bond portfolio managers, bankers, and policy makers, as well as an important input for risk management and regulation. Credit ratings are traditionally generated from models that use financial statement data and market data, which are tabular (numeric and categorical). Using machine learning methods, we construct a network of firms from U.S. Securities and Exchange Commission (SEC) filings (denoted CorpNet) to enhance the traditional tabular data set with a corporate graph. We show that this yields accurate rating predictions, with performance comparable to or better than that of tabular models. We ensemble graph convolutional networks with high-performing ensembled machine learning models using AutoGluon. This paper demonstrates both transductive and inductive methodologies to extend credit scoring models based on tabular data, which have been used by the ratings industry for decades, to the class of machine learning models on networks. The methodology is extensible to other financial machine learning models that may be enhanced using a corporate graph.

History: David Martens served as the senior editor for this article.

Data Ethics & Reproducibility Note: No data ethics considerations are foreseen related to this article. The paper deals with corporate credit risk and not consumer credit, which usually entails issues around privacy and bias. The code capsule is available on Code Ocean at https://codeocean.com/capsule/5230264/tree/v2 and in the e-Companion to this article (available at https://doi.org/10.1287/ijds.2022.00018).
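To make the idea of enhancing tabular features with a corporate graph concrete, the following is a minimal sketch of one standard graph-convolution step in NumPy. It is not the authors' architecture or the CorpNet data: the four-firm adjacency matrix, the two feature columns, the weight matrix, and the function names `normalize_adjacency` and `gcn_layer` are all hypothetical, chosen only to illustrate how each firm's features are mixed with those of its neighbors.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the symmetric normalization
    commonly used in graph convolutional layers."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops
    deg = a_hat.sum(axis=1)              # degree including the self-loop
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj_norm, features, weights):
    """One propagation step: ReLU(A_norm @ X @ W)."""
    return np.maximum(adj_norm @ features @ weights, 0.0)


# Hypothetical toy graph: firms 0-1 and 1-2 are linked, firm 3 is isolated.
adj = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
    ],
    dtype=float,
)

# Hypothetical tabular features: 4 firms x 2 made-up numeric columns.
features = np.array(
    [
        [0.5, 1.2],
        [0.1, 0.7],
        [0.9, 0.3],
        [0.4, 0.4],
    ]
)

# Random weights stand in for parameters that would normally be learned.
rng = np.random.default_rng(0)
weights = rng.normal(size=(2, 3))

adj_norm = normalize_adjacency(adj)
hidden = gcn_layer(adj_norm, features, weights)
print(hidden.shape)
```

In this sketch the isolated firm keeps only its own features (its normalized self-loop weight is 1), while connected firms receive a degree-weighted blend of their neighbors' features. A full model would stack such layers, train the weights against rating labels, and could then be combined with a tabular model's predictions.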

Full Text