Abstract
This paper analyzed and compared the forecasting performance of three machine learning algorithms (multiple linear regression, random forest, and an LSTM network) for stock price prediction, using the closing price data of a NASDAQ ETF and data on statistical factors. The test results show that forecasts based on closing prices are better than those based on statistical factors, although the difference is not significant. Multiple linear regression is the most suitable algorithm for stock price forecasting. Random forest ranks second and is prone to overfitting. The LSTM network performs worst, with the highest RMSE and MAPE values.
Highlights
Machine learning [1] is the general term for a class of algorithms that attempt to extract hidden information from historical data and solve the problems of prediction or classification
Multiple linear regression gives the best forecast of the three algorithms; its RMSE and MAPE values are close to those of random forest
The forecast effect of random forest is between multiple linear regression and LSTM networks; its RMSE and MAPE values are close to those of multiple linear regression
Summary
Machine learning [1] is the general term for a class of algorithms that attempt to extract hidden information from historical data and solve problems of prediction or classification. Prior work falls into two strands. In one, researchers improve an original algorithm and try to show that the improved version predicts future stock prices better than the original [2,3]. In the other, researchers analyze and verify which algorithm is the most suitable for forecasting future stock prices based on a comparison of several machine learning algorithms [4,5,6,7,8,9]. The algorithms most commonly used in stock prediction are multiple linear regression [10], neural network algorithms such as Long Short-Term Memory [11], support vector machines [12], and random forest [13,14]. The remaining algorithms [15] are usually used as complementary algorithms to improve and optimize the ones above.
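The comparison above hinges on the RMSE and MAPE error metrics. A minimal sketch of both metrics, together with a one-lag least-squares fit standing in for the paper's multiple linear regression, is shown below; the price series and the `fit_lagged_ols` helper are illustrative assumptions, not the paper's actual data or model:

```python
import math

def rmse(actual, forecast):
    # Root mean squared error: sqrt of the mean squared deviation.
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mape(actual, forecast):
    # Mean absolute percentage error, expressed in percent.
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def fit_lagged_ols(prices, lag=1):
    # Ordinary least squares fit of price[t] = b0 + b1 * price[t - lag],
    # a single-feature stand-in for the multiple linear regression model.
    xs, ys = prices[:-lag], prices[lag:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    b0 = my - b1 * mx
    return b0, b1

# Hypothetical closing prices (NOT real NASDAQ ETF data).
closes = [100.0, 101.5, 103.0, 102.0, 104.5, 106.0, 105.0, 107.5]
b0, b1 = fit_lagged_ols(closes)
preds = [b0 + b1 * p for p in closes[:-1]]  # one-step-ahead in-sample forecasts
print("RMSE:", round(rmse(closes[1:], preds), 3))
print("MAPE:", round(mape(closes[1:], preds), 3))
```

Lower values of both metrics indicate a better fit, which is how the three algorithms are ranked in this paper.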