Abstract

This paper analyzes and compares the forecasting performance of three machine learning algorithms (multiple linear regression, random forest, and an LSTM network) for stock prices, using the closing prices of a NASDAQ ETF and statistical factors derived from them as input data. The test results show that forecasts based on the closing price data are better than those based on the statistical factors, although the difference is not significant. Multiple linear regression is the most suitable algorithm for stock price forecasting; random forest ranks second and is prone to overfitting; the LSTM network performs worst, with the highest RMSE and MAPE values.
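The exact definitions of the two error measures are not shown in this excerpt; they are assumed here to take their standard forms, with y_t the actual closing price, \hat{y}_t the forecast value, and n the number of test observations:

\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\bigl(y_t - \hat{y}_t\bigr)^{2}},
\qquad
\mathrm{MAPE} = \frac{100\%}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right|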

Highlights

  • Machine learning [1] is the general term for a class of algorithms that attempt to extract hidden information from historical data and solve prediction or classification problems.

  • The forecasting performance of multiple linear regression lies between that of random forest and the LSTM network; its RMSE and MAPE values are close to those of random forest.

  • The forecasting performance of random forest lies between that of multiple linear regression and the LSTM network; its RMSE and MAPE values are close to those of multiple linear regression.

Introduction

Machine learning [1] is the general term for a class of algorithms that attempt to extract hidden information from historical data in order to solve prediction or classification problems. Research on applying machine learning to stock price forecasting falls into two broad strands. In the first, researchers improve an existing algorithm and try to show that the improved version predicts future stock prices better than the original [2,3]. In the second, researchers compare several machine learning algorithms to determine which is the most suitable for forecasting future stock prices [4,5,6,7,8,9]. The algorithms most commonly used for stock prediction are multiple linear regression [10], neural networks such as the Long Short-Term Memory (LSTM) network [11], support vector machines [12], and random forest [13,14]. The remaining algorithms [15] are usually used as complementary methods to improve and optimize the algorithms above.
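As an illustration of the comparison carried out in this paper, the following sketch (not the authors' code) fits a multiple linear regression, a random forest, and a small LSTM network to lagged closing prices and scores each model with RMSE and MAPE. The synthetic price series, the five-day lag window, the 80/20 chronological split, and all hyperparameters are assumptions made for the example; the paper's actual data preparation and model settings may differ. The linear model and random forest use scikit-learn; the LSTM part additionally requires TensorFlow/Keras.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_squared_error, mean_absolute_percentage_error
    from tensorflow.keras.models import Sequential          # needed only for the LSTM
    from tensorflow.keras.layers import Input, LSTM, Dense

    def make_lagged(series, window=5):
        """Turn a 1-D price series into (samples, window) features and next-day targets."""
        X = np.array([series[i:i + window] for i in range(len(series) - window)])
        y = series[window:]
        return X, y

    def report(name, y_true, y_pred):
        """Print the two error measures used in the paper."""
        rmse = mean_squared_error(y_true, y_pred) ** 0.5
        mape = mean_absolute_percentage_error(y_true, y_pred) * 100
        print(f"{name}: RMSE={rmse:.3f}, MAPE={mape:.2f}%")

    # Synthetic random-walk prices standing in for the NASDAQ ETF closing prices.
    rng = np.random.default_rng(0)
    prices = 100 + np.cumsum(rng.normal(0.0, 1.0, 1000))

    X, y = make_lagged(prices, window=5)
    split = int(0.8 * len(X))                 # chronological split, no shuffling
    X_train, X_test = X[:split], X[split:]
    y_train, y_test = y[:split], y[split:]

    # Multiple linear regression.
    mlr = LinearRegression().fit(X_train, y_train)
    report("multiple linear regression", y_test, mlr.predict(X_test))

    # Random forest (depth capped to curb the overfitting noted in the abstract).
    rf = RandomForestRegressor(n_estimators=200, max_depth=5, random_state=0)
    rf.fit(X_train, y_train)
    report("random forest", y_test, rf.predict(X_test))

    # LSTM network; Keras expects inputs shaped (samples, timesteps, features).
    lstm = Sequential([Input(shape=(X.shape[1], 1)), LSTM(32), Dense(1)])
    lstm.compile(optimizer="adam", loss="mse")
    lstm.fit(X_train[..., None], y_train, epochs=20, batch_size=32, verbose=0)
    report("LSTM network", y_test, lstm.predict(X_test[..., None], verbose=0).ravel())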

