In this study, we propose algorithms to predict future stock market trends from eight input features, including financial technical indicators, gold prices, a gold price volatility index, crude oil prices, a crude oil price volatility index, and other characteristic data. Two labeling methods are used with separate classification algorithms: a two-category scheme that predicts stock price changes (up or down) and a three-category scheme that recommends trading actions (buy, sell, or hold). We also analyze the validity of these characteristic data in terms of their ability to predict future trends. The S&P 500 (GSPC) is the target of these forecasts. Sample data from 2010 to 2018 are split 8:2 into training and validation data, while data from 2019 are used to test the proposed approach. CNN and LSTM models are compared in terms of classification accuracy and investment returns. Bayesian optimization (BO) of hyperparameters is used to improve the accuracy of the model and increase the return on investment (ROI) of the output predictions. The purpose of this study is to verify whether using gold prices, a gold price volatility index, crude oil prices, and a crude oil price volatility index as input features enables a deep learning model to accurately predict future stock price trends, and to discuss separately the applicability of CNN and LSTM models to these characteristics and financial indicators. We also present the results of experiments conducted to evaluate the proposed method in terms of classification accuracy and the confusion matrix. In the case of three-category classification, the model takes feature data as input and outputs a predicted trading order on whether to buy, sell, or hold a given set of stocks tomorrow, as well as the timing of entry into and exit from each position. It is also backtested on out-of-sample data to find the combination of characteristics and indicators that maximizes ROI.
Using this three-category method, we obtain a comprehensive ROI for a given set of individual stocks and assess whether each type of stock is suited to a prediction model based on input features such as gold and crude oil, as well as the fields for which a given feature is suitable. Experimental results show that the proposed approach is able to predict whether the stock price will rise or fall in the next 10 days, with a model accuracy of up to 67%. The proposed combined CNN model with eight input features, referred to as CNN8, achieved an ROI of up to 13.23% on out-of-sample 2019 data, which was superior to the 12.08% and 11.06% obtained by the models designated CNN4 (CNN with four input features) and LSTM8 (LSTM with eight input features), respectively. The F1 score increased from 0.75 to 0.79 as a result of applying BO. The results indicate that considering the price of gold, the gold price volatility index, the crude oil price, and the crude oil price volatility index, rather than financial indicators alone, can help obtain a better ROI for companies in certain fields, such as the semiconductor, petroleum, and automotive industries. However, for companies related to apparel, fast food, and copy processing, input features consisting purely of financial technical indicators were found to be suitable.
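As a rough illustration of the two-category setup, the sketch below labels each day by whether the closing price is higher 10 trading days later and then splits the labeled samples 8:2 in time order. The exact labeling rule and the chronological nature of the split are assumptions, since the abstract does not specify them, and the function names `label_up_down` and `chronological_split` are hypothetical.

```python
from typing import List, Sequence, Tuple


def label_up_down(closes: Sequence[float], horizon: int = 10) -> List[int]:
    """Label day i as 1 (up) if the close `horizon` days later is higher, else 0 (down).

    Assumed rule for illustration; the last `horizon` days get no label
    because their future price is not yet known.
    """
    return [
        1 if closes[i + horizon] > closes[i] else 0
        for i in range(len(closes) - horizon)
    ]


def chronological_split(
    rows: Sequence, train_fraction: float = 0.8
) -> Tuple[Sequence, Sequence]:
    """Split samples in time order (no shuffling) into training and validation parts."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


if __name__ == "__main__":
    closes = [100.0 + day for day in range(20)]  # toy, steadily rising prices
    labels = label_up_down(closes)
    train, valid = chronological_split(labels)
    print(len(labels), len(train), len(valid))
```

Keeping the split in time order avoids training on samples that come after the validation samples, which matters for a backtest-style evaluation.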