Abstract

Stock return prediction has long been an active topic in both research and industry given its potential for large financial gain. The return signal, apart from its inherent volatility and complexity, is often accompanied by many sources of noise, such as other stocks’ performance, macroeconomic factors, and financial news. To better characterize these factors, we propose a new model that operates on two levels of sequence: an NLP-based module that captures the sequential nature of words and sentences in financial news, and a time-series-based module that exploits the sequential nature of adjacent observations in the stock price. In the proposed framework, we employ Hierarchical Attention Networks (HAN) in the text mining module, which can effectively model financial news and extract important signals at both the word and sentence level. For the time series module, the established Long Short-Term Memory (LSTM) network is used to model the complex serial dependence in the time series data. Using the Dow Jones Industrial Average (DJIA) dataset, we compare our model against benchmark models that use either module alone, as well as alternatives based on the traditional Bag of Words (BOW) approach. Experimental results show that our proposed method performs better on several classification metrics for both positive and negative stock returns.
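
To make the word-level and sentence-level attention idea concrete, the following is a minimal sketch in plain Python, not the model evaluated in this work. It assumes word vectors are already given, and it omits the recurrent encoders and learned projections that a full HAN applies before scoring. The names `attention_pool`, `encode_document`, `word_context`, and `sentence_context` are hypothetical and chosen only for illustration.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors: List[Vector], context: Vector) -> Vector:
    """Score each vector by its dot product with a context vector,
    normalize the scores with softmax, and return the weighted sum."""
    scores = [sum(x * c for x, c in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def encode_document(doc: List[List[Vector]],
                    word_context: Vector,
                    sentence_context: Vector) -> Vector:
    """Two-level pooling: words -> sentence vectors -> document vector.

    Assumption: `doc` is a list of sentences, each a list of word vectors
    of equal dimension. In a trained model the context vectors would be
    learned; here they are passed in by the caller.
    """
    sentence_vectors = [attention_pool(sentence, word_context) for sentence in doc]
    return attention_pool(sentence_vectors, sentence_context)


# Toy news item: two sentences with 2-dimensional word vectors.
doc = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0]],
]
# Zero context vectors give uniform attention, so pooling reduces to a mean.
doc_vector = encode_document(doc, word_context=[0.0, 0.0], sentence_context=[0.0, 0.0])
```

With non-zero context vectors, words and sentences that align with the context receive larger weights, which is the mechanism that lets the text module emphasize the more informative parts of a news item.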
