Abstract

Using Reuters data and machine learning, this paper explores the role of linguistic features of business news headlines in bankruptcy prediction. The sample comprises 34,921 firm-year observations identified from headlines published between 2003 and 2016, of which 66 eventually file for bankruptcy. For each observation in the full sample, 33 linguistic features are extracted from the news headlines: the frequency of company mentions, 16 sentiment features, and the frequency of mentions of 16 bankruptcy-specific topics. The 16 sentiment features are computed with three natural language processing (NLP) tools: TextBlob, Flair, and VADER. The 16 bankruptcy-specific topics are those that best distinguish bankruptcies from non-bankruptcies according to a decision tree approach. In addition to the linguistic features, this paper follows prior research and employs 19 financial/market-based variables. Two tests are conducted: predicting bankruptcies (1) in the subsequent year and (2) in the subsequent two years. In each test, both the prediction accuracy of the machine learning models and the explanatory ability of the statistical model (i.e., the logistic regression model) are examined. For the first test, machine learning results show that both financial/market-based variables and news linguistic features provide strong bankruptcy prediction accuracy for the subsequent year (maximum AUC score of 0.965). Even with the news linguistic features alone, the prediction accuracy is reasonably high (maximum AUC score of 0.791). Moreover, the news linguistic features add significant incremental explanatory ability (p < 0.01) in the logistic regression. Results for the second test are similar: linguistic features make a significant incremental contribution to both prediction accuracy (p < 0.05) and explanatory ability (p < 0.01), suggesting that news headlines provide strong signals as early as two years prior to bankruptcy.
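The abstract reports prediction accuracy as AUC rather than raw accuracy. With only 66 bankruptcies among 34,921 firm-years, a model that never predicts bankruptcy would already be about 99.8% accurate, so a ranking-based metric is more informative. The following is a minimal sketch of how an AUC score can be computed from model scores, using the pairwise (Mann-Whitney) definition. The function name `auc_score` and the toy labels and scores are assumptions made for illustration only; they are not the paper's data, models, or code.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case
    (label 1, e.g. a bankrupt firm-year) receives a higher score than a
    randomly chosen negative case (label 0). Ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 bankruptcies among 8 firm-years.
toy_labels = [0, 0, 0, 0, 1, 0, 0, 1]
toy_scores = [0.05, 0.10, 0.20, 0.60, 0.70, 0.15, 0.30, 0.40]

toy_auc = auc_score(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}")
```

In this toy example, 11 of the 12 bankrupt/non-bankrupt pairs are ranked correctly, so the AUC is 11/12. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.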
