Abstract

Purpose

Because stock returns and volatility matter to investors, this study proposes incorporating the textual sentiment of annual reports into stock price crash risk prediction.

Design/methodology/approach

Specific sentences gathered from management discussions and their subsequent analyses are tokenized and transformed into numeric vectors using text mining techniques. The Naïve Bayes method is then applied to score the sentiment, which is used as an input variable for crash risk prediction. The results are compared across a collection of predictive models, including linear regression (LR) and machine learning techniques.

Findings

The experimental results show that predictive models incorporating textual sentiment significantly outperform baseline models that include only accounting and market variables. These conclusions hold when crash risk is proxied by either the negative skewness of the return distribution or down-to-up volatility (DUVOL).

Research limitations/implications

The authors' study focuses on the predictive power of textual sentiment in crash risk prediction; other dimensions of textual features, such as readability and thematic content, are not considered. Future studies should explore the predictive power of textual features along these other dimensions and include more recent sample data.

Originality/value

The authors' study has implications for the information value of textual data in financial analysis and risk management. It suggests that the soft information contained in annual reports may be informative for crash risk prediction, and that incorporating textual sentiment provides an incremental improvement in overall predictive performance.
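The two crash risk proxies named in the Findings can be illustrated with a minimal sketch. The abstract does not state the formulas, so the code below assumes the definitions commonly used in the crash risk literature: the negative coefficient of skewness of firm-specific weekly returns, and the log ratio of down-week to up-week return variance. The function names `ncskew` and `duvol`, the demeaning step, and the sample returns are hypothetical choices made for illustration, not the authors' implementation.

```python
import math


def ncskew(returns):
    """Negative coefficient of skewness of a return series.

    Assumed definition: minus the third central moment, scaled by the
    variance raised to 3/2, with the usual small-sample adjustment.
    Higher values indicate a more left-skewed (crash-prone) series.
    """
    n = len(returns)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(returns) / n
    dev = [r - mean for r in returns]
    s2 = sum(d ** 2 for d in dev)
    s3 = sum(d ** 3 for d in dev)
    return -(n * (n - 1) ** 1.5 * s3) / ((n - 1) * (n - 2) * s2 ** 1.5)


def duvol(returns):
    """Down-to-up volatility (DUVOL).

    Assumed definition: log of the variance of returns in "down" periods
    (below the sample mean) over the variance in "up" periods (at or
    above the mean). Some papers state the ratio in terms of standard
    deviations instead, which halves the value.
    """
    n = len(returns)
    mean = sum(returns) / n
    down = [r - mean for r in returns if r < mean]
    up = [r - mean for r in returns if r >= mean]
    if len(down) < 2 or len(up) < 2:
        raise ValueError("need at least 2 down and 2 up observations")
    down_var = sum(d ** 2 for d in down) / (len(down) - 1)
    up_var = sum(u ** 2 for u in up) / (len(up) - 1)
    return math.log(down_var / up_var)


# Hypothetical firm-specific weekly returns with one large negative shock.
sample = [0.03, 0.02, 0.01, 0.01, -0.02, -0.05]
print("NCSKEW:", ncskew(sample))
print("DUVOL:", duvol(sample))
```

Under these assumed definitions, both measures rise when the return distribution has a heavier left tail, which is why either can stand in as the prediction target.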
