Forecasting changes in stock prices is extremely challenging because numerous factors cause these prices to fluctuate. The random walk hypothesis and the efficient market hypothesis essentially state that future stock prices, and changes in the stock market overall, cannot be predicted systematically and reliably. Nonetheless, machine learning (ML) techniques that use historical data have been applied to make such predictions. Previous studies focused on a small number of stocks and claimed success with limited statistical confidence. In this study, we construct feature vectors composed of multiple previous relative returns and apply random forest (RF), support vector machine (SVM), and long short-term memory (LSTM) classifiers to predict whether a stock will return 2% more than its index in the following 10 days. We apply this approach to all S&P 500 companies for the period 2017–2022. We assess performance using accuracy, precision, and recall and compare our results with a random choice strategy. We observe that the LSTM classifier outperforms RF and SVM, and that the data-driven ML methods outperform the random choice classifier (p = 8.46e−17 for the accuracy of LSTM). Thus, our results provide strong statistical evidence against the random walk and efficient market hypotheses in the considered context.
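To make the setup concrete, the following is a minimal sketch of how such feature vectors and labels could be built, and how accuracy, precision, and recall could be computed against a random choice baseline. It is an illustration under stated assumptions, not the procedure used in the study: the abstract does not specify the number of lagged returns, the exact definition of relative return, or how the random choice strategy is drawn. The function names, the default `n_lags=5`, the definition of relative return as the stock's daily return minus the index's daily return, and the fair-coin baseline are all hypothetical choices made here.

```python
import numpy as np


def relative_returns(stock, index):
    """Daily stock return minus daily index return (assumed definition)."""
    stock = np.asarray(stock, dtype=float)
    index = np.asarray(index, dtype=float)
    stock_ret = stock[1:] / stock[:-1] - 1.0
    index_ret = index[1:] / index[:-1] - 1.0
    return stock_ret - index_ret  # element k covers day k -> day k + 1


def build_dataset(stock, index, n_lags=5, horizon=10, threshold=0.02):
    """Build (X, y) from aligned daily closing prices.

    X[i] holds the n_lags most recent relative returns up to day t.
    y[i] is 1 if the stock beats the index by more than `threshold`
    over the next `horizon` days, else 0.
    n_lags=5 is a hypothetical choice; horizon and threshold follow
    the 10-day / 2% target described in the text.
    """
    stock = np.asarray(stock, dtype=float)
    index = np.asarray(index, dtype=float)
    rel = relative_returns(stock, index)
    X, y = [], []
    for t in range(n_lags, len(stock) - horizon):
        X.append(rel[t - n_lags:t])  # only information available at day t
        stock_fwd = stock[t + horizon] / stock[t] - 1.0
        index_fwd = index[t + horizon] / index[t] - 1.0
        y.append(int(stock_fwd - index_fwd > threshold))
    return np.array(X), np.array(y)


def accuracy_precision_recall(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels in {0, 1}."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    accuracy = float(np.mean(y_true == y_pred))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def random_choice_predictions(n, seed=0):
    """Assumed baseline: predict 0 or 1 with equal probability."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 2, size=n)


if __name__ == "__main__":
    # Synthetic prices for illustration only; not data from the study.
    rng = np.random.default_rng(42)
    index = 100.0 * np.cumprod(1.0 + rng.normal(0.0, 0.01, size=300))
    stock = 50.0 * np.cumprod(1.0 + rng.normal(0.0, 0.02, size=300))
    X, y = build_dataset(stock, index)
    baseline = random_choice_predictions(len(y))
    print(X.shape, accuracy_precision_recall(y, baseline))
```

Any classifier (RF, SVM, or LSTM) would then be trained on `X` and `y` from an earlier period and scored on a later one with `accuracy_precision_recall`, so that its metrics can be compared directly with those of the random baseline.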