Abstract

We use multiple heterogeneous data sources, including historical transaction data, technical indicators, stock posts, news and the Baidu index, to predict the direction of stock price movements. We focus on the distinct predictive patterns of active and inactive stocks, and we examine the predictive power of the support vector machine (SVM) at different levels of activity for a single stock. We construct a total of 14 data source combinations from the above 5 heterogeneous data sources and choose three forecasting horizons, namely 1 day, 2 days and 3 days, so that we can investigate how well stock price movements in the China A-share market can be forecast under different data source combinations and forecasting horizons. We conclude that the optimal data source combinations for active and inactive stocks differ. Active stocks achieve the highest accuracy when multiple non-traditional data sources are combined, while inactive stocks achieve the highest accuracy when traditional data sources are combined with non-traditional ones. For each stock, we further distinguish inactive, active and very active periods, and compare the forecast performance for the same stock across these periods. For most combinations of data sources, the more active the stock is, the higher the accuracy, which indicates that our approach is more powerful for predicting the price movements of stocks in active and very active periods.
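
As a minimal sketch of how direction labels for the three forecasting horizons might be built, the snippet below assumes that the direction at day t is defined by comparing the closing price h days later with the closing price at day t. The abstract does not state the exact labeling rule, so this definition, the function name `direction_labels` and the sample prices are all hypothetical and serve only as an illustration.

```python
from typing import List


def direction_labels(closes: List[float], horizon: int) -> List[int]:
    """Label day t as 1 if the close `horizon` days later is higher, else 0.

    Assumption: direction is defined by the close-to-close change; the
    abstract does not specify the exact labeling rule.
    """
    if horizon < 1:
        raise ValueError("horizon must be a positive number of days")
    # The last `horizon` days have no future price, so they get no label.
    return [
        1 if closes[t + horizon] > closes[t] else 0
        for t in range(len(closes) - horizon)
    ]


# Hypothetical closing prices, for illustration only.
closes = [10.0, 10.2, 10.1, 10.4, 10.3, 10.6]

# One label series per forecasting horizon (1, 2 and 3 days).
labels_by_horizon = {h: direction_labels(closes, h) for h in (1, 2, 3)}
```

Each label series is shorter than the price series by the horizon length, so features built from any combination of data sources would need to be truncated to the same days before a classifier such as an SVM is fitted.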
