Predicting Corporate Credit Ratings Using Content Analysis of Annual Reports – A Naïve Bayesian Network Approach

Petr Hajek, Vladimir Olej, Ondrej Prochazka

doi:10.1007/978-3-319-52764-2_4

Abstract

Corporate credit ratings are based on a variety of information, including financial statements, annual reports, and management interviews. Financial indicators are critical for evaluating corporate creditworthiness. However, little is known about how qualitative information hidden in firm-related documents manifests in the credit rating process. To address this issue, this study develops a methodology for extracting topical content from firm-related documents using latent semantic analysis. This information is integrated with traditional financial indicators into a multi-class corporate credit rating prediction model. Informative indicators are selected using a correlation-based filter during feature selection. We demonstrate that Naïve Bayesian networks perform statistically equivalently to other machine learning methods in terms of classification performance. We further show that the “red flag” values obtained using Naïve Bayesian networks may indicate low credit quality (non-investment rating classes) of firms. These findings can be particularly important for investors, banks and market regulators.
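As a rough illustration of the prediction step summarized in the abstract, the sketch below fits a naive Bayes classifier to feature vectors that combine financial indicators with a topical score. It is not the authors' implementation. The Gaussian likelihoods, the function names `fit_gaussian_nb` and `predict`, the three features, the rating labels and all numeric values are assumptions made up for this example. The latent semantic analysis and correlation-based filtering steps are omitted, so the topic score is simply taken as given.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-6):
    """Estimate per-class priors, feature means and feature variances."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    model = {}
    n_samples = len(X)
    for label, rows in by_class.items():
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # var_floor keeps the variance strictly positive for constant features.
        variances = [
            sum((v - m) ** 2 for v in col) / len(col) + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (len(rows) / n_samples, means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log-posterior for one sample."""
    def log_posterior(prior, means, variances):
        score = math.log(prior)
        # Naive assumption: features are independent given the class.
        for v, m, s2 in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return score

    return max(model, key=lambda label: log_posterior(*model[label]))


# Made-up toy data (assumption): [leverage ratio, return on assets, topic score]
X = [
    [0.20, 0.12, 0.8], [0.25, 0.10, 0.7],    # hypothetical class "A"
    [0.50, 0.05, 0.4], [0.55, 0.04, 0.5],    # hypothetical class "BBB"
    [0.85, -0.02, 0.1], [0.90, -0.05, 0.2],  # hypothetical class "B"
]
y = ["A", "A", "BBB", "BBB", "B", "B"]

model = fit_gaussian_nb(X, y)
```

Calling `predict(model, [0.22, 0.11, 0.75])` scores a new firm against every rating class and returns the most probable one. The paper may discretize its indicators rather than assume Gaussian likelihoods, so the likelihood model here should be read as a placeholder.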

Full Text