Abstract

This paper proposes generalized deep reinforcement learning with a multivariate state space, discrete rewards, and adaptive synchronization for trading any stock held in the S&P 500. Specifically, the proposed trading model observes the daily historical data of a stock held in the S&P 500 together with multiple market-indicating securities (SPY, IEF, EUR=X, GSG), selects a trading action, and observes a discrete reward that is based on the correctness of the selected action and is independent of stock volatility. The model's reward-maximizing behavior is optimized with a standard deep Q-network (DQN) using adaptive synchronization, which stabilizes training and enables tracking of learning performance as the model generalizes to new experiences from each stock. The proposed trading model was trained on the top 50 holdings of the S&P 500 and tested on the top 100 holdings of the S&P 500 over the period 2006 to 2022. Experimental results suggest that the proposed trading model significantly outperforms the 100% long-strategy benchmark in terms of annualized return, Sharpe ratio, and maximum drawdown.
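
Since only the abstract is available here, the sketch below is a minimal, hypothetical Python illustration of the two mechanisms named above: a volatility-independent discrete reward keyed to the correctness of the chosen action, and an adaptive (condition-triggered rather than fixed-interval) synchronization of the DQN target network. The action set, threshold, and plateau rule are assumptions for illustration, not the paper's definitions.

```python
import numpy as np

def discrete_reward(action, next_return, threshold=0.0):
    """Hypothetical discrete reward: +1 if the chosen action matches the
    direction of the next-day return, -1 otherwise, regardless of the
    return's magnitude (the abstract only states that the reward depends on
    correctness and is independent of volatility).

    action: 0 = short/sell, 1 = hold, 2 = long/buy  (assumed action set)
    next_return: realized next-day return of the traded stock
    """
    if action == 2:   # long: correct when the stock rises beyond the threshold
        return 1.0 if next_return > threshold else -1.0
    if action == 0:   # short: correct when the stock falls beyond the threshold
        return 1.0 if next_return < -threshold else -1.0
    # hold: correct when the move stays inside the threshold band
    return 1.0 if abs(next_return) <= threshold else -1.0

def maybe_sync_target(online_weights, target_weights, td_errors, patience=100, tolerance=1e-3):
    """Hypothetical 'adaptive synchronization': copy the online network's
    weights into the target network only when recent TD errors have plateaued,
    instead of on a fixed step schedule (assumed interpretation of the term).
    """
    recent = np.asarray(td_errors[-patience:])
    if len(recent) == patience and recent.std() < tolerance:
        target_weights = [w.copy() for w in online_weights]
    return target_weights
```

A fixed-interval target update would copy weights every N steps; the condition-triggered variant above is one plausible reading of "adaptive synchronization," deferring the copy until learning has stabilized on the current stock's experiences.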
