Managing financial risk is one of the most challenging tasks facing financial institutions. Over the last two decades, diverse quantitative models and approaches have been developed and refined to address the impact of volatile markets on business. Whereas existing approaches make intensive use of structured data such as historical price series, little attention has been paid to unstructured (textual) data, which could be a rich source of information in this context. Previous empirical research has shown that certain news stories, such as corporate disclosures, can cause abnormal price behavior following their publication. Using a data set comprising such news stories as well as intraday stock prices, this paper explores the risk implications of information becoming newly available to market participants. After showing that such events can significantly drive stock price volatility, the paper aims to identify, among the textual data provided, those disclosures that resulted in the most supranormal risk exposures. To this end, four different learners (Naïve Bayes, k-Nearest Neighbour, Neural Network, and Support Vector Machine) are applied to detect patterns in the textual data that could explain increased risk exposure. Two evaluations are presented to assess the learning capabilities of the approach in the context of risk management: first, “classic” data mining evaluation metrics are applied; second, a newly developed simulation-based evaluation method is presented. The evaluation results provide strong evidence that unstructured (textual) data is a valuable source of information for financial risk management as well, a domain in which it has so far received little attention. With regard to classification performance, the results also show significant differences between the applied learning techniques.