Abstract

Algorithmic trading based on machine learning can exploit intrinsic features and embedded causality in complex stock price time series. We propose a novel algorithmic trading model based on recurrent reinforcement learning, optimized for producing consecutive trading signals. This paper elaborates on how temporal features are optimally extracted from complex observations to maximize the expected reward of the reinforcement learning model. Our model incorporates a hybrid learning loss so that the sequences of hidden features fed to the reinforcement learning model fully retain the characteristics of the original state. A self-attention mechanism is also introduced to learn the temporal importance of the hidden representation series, making the reinforcement learning model aware of temporal dependence in its decision-making. We verify the effectiveness of the proposed model on several major market indices and on representative stocks from each sector of the S&P 500. The augmented structure we propose yields a significant improvement in trading performance. Our proposed model, self-attention based deep direct recurrent reinforcement learning with hybrid loss (SA-DDR-HL), outperforms well-known baseline benchmark models, including machine learning and time series models.
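To make the attention component concrete, the following is a minimal sketch of scaled dot-product self-attention applied to a series of hidden feature vectors, of the kind the abstract describes for weighting temporal importance. This is an illustrative NumPy implementation under our own assumptions (function name, shapes, and the use of the hidden states themselves as queries, keys, and values are hypothetical), not the paper's actual code.

```python
import numpy as np

def self_attention(H):
    """Scaled dot-product self-attention over a hidden-state series.

    H: (T, d) array of T time steps of d-dimensional hidden features.
    Returns the attended representation (T, d) and the attention
    weights (T, T); row t gives the temporal importance that step t
    assigns to every step in the series.
    """
    d = H.shape[1]
    scores = H @ H.T / np.sqrt(d)                 # pairwise similarity of time steps
    scores -= scores.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True) # softmax over the time axis
    return weights @ H, weights

# Toy example: 5 time steps of 4-dimensional hidden features.
rng = np.random.default_rng(0)
H = rng.standard_normal((5, 4))
attended, w = self_attention(H)
```

Each row of `w` is a probability distribution over time steps, so the downstream policy can weight recent and distant hidden states differently when producing a trading signal.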
