Abstract

This paper aims to enhance credit risk assessment for non-financial companies in Romania by developing a machine learning (ML) model to estimate the probability of default. Using an extensive set of microeconomic data, including financial statements, loan-level data from the Credit Risk Register, shareholder structure, export and import activities, and external debt, the model provides a comprehensive analysis of a company’s financial health and risk profile. The ML model employs logistic regression for classification, with 80% of the data used for training and 20% for validation. The model’s performance was evaluated using the receiver operating characteristic curve and confusion matrix, demonstrating an accuracy of 88%. Further validation through point-in-time estimation confirmed the model’s stability. The study is limited by the relatively low number of defaulting companies in the sample and the unique economic disruptions of 2020 due to the COVID-19 pandemic. To account for these factors, a Random Under Sample Boosted Trees approach is employed, which improves the model’s ability to distinguish between defaulted and non-defaulted debtors. Despite these limitations, the research concludes that integrating extensive financial data with advanced ML techniques has the potential to markedly enhance credit risk assessment, providing a reliable tool for financial institutions to manage credit risk effectively. Future improvements could address data imbalance and incorporate more diverse economic conditions to enhance predictive power for defaulting companies.
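The workflow summarized above (an 80/20 split, a logistic regression classifier, and evaluation with a confusion matrix) can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the data are synthetic, the single "leverage" feature, the 5% default rate, and all function names are hypothetical, and the imbalance handling is plain random undersampling of non-defaulters rather than the boosted-trees variant used in the study. It is not the authors' implementation.

```python
import math
import random


def make_synthetic(n, default_rate, rng):
    """Hypothetical data: one 'leverage' feature, higher on average for defaulters."""
    rows = []
    for _ in range(n):
        y = 1 if rng.random() < default_rate else 0
        x = rng.gauss(1.5 if y else 0.0, 1.0)
        rows.append(([x], y))
    return rows


def train_test_split(rows, train_frac, rng):
    """Shuffle a copy of the rows and cut it into training and validation parts."""
    rows = rows[:]
    rng.shuffle(rows)
    cut = int(len(rows) * train_frac)
    return rows[:cut], rows[cut:]


def undersample(rows, rng):
    """Random undersampling: keep every default, sample as many non-defaults."""
    pos = [r for r in rows if r[1] == 1]
    neg = [r for r in rows if r[1] == 0]
    return pos + rng.sample(neg, min(len(neg), len(pos)))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(model, x):
    """Estimated probability of default for feature vector x."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fit_logistic(rows, lr=0.1, epochs=200):
    """Fit weights and intercept by batch gradient descent on the log-loss."""
    w, b = [0.0] * len(rows[0][0]), 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for x, y in rows:
            err = predict_proba((w, b), x) - y
            gw = [g + err * xj for g, xj in zip(gw, x)]
            gb += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, gw)]
        b -= lr * gb / len(rows)
    return w, b


def confusion_matrix(model, rows, threshold=0.5):
    """Count true/false positives and negatives at the given probability cutoff."""
    cm = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for x, y in rows:
        pred = 1 if predict_proba(model, x) >= threshold else 0
        key = ("t" if pred == y else "f") + ("p" if pred else "n")
        cm[key] += 1
    return cm


def accuracy(cm):
    """Share of correctly classified companies."""
    return (cm["tp"] + cm["tn"]) / sum(cm.values())


if __name__ == "__main__":
    rng = random.Random(0)
    train, valid = train_test_split(make_synthetic(1000, 0.05, rng), 0.8, rng)
    model = fit_logistic(undersample(train, rng))
    cm = confusion_matrix(model, valid)
    print(cm, "accuracy:", round(accuracy(cm), 3))
```

With few defaulters in the sample, accuracy on its own can look high even when most defaults are missed, so the `fn` and `tp` counts are worth reading next to it. That is the motivation for rebalancing the training set while leaving the validation set at its natural class proportions.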
