Abstract

We consider the problem of execution timing in optimal execution. Specifically, we formulate the optimal execution problem of an infinitesimal order as an optimal stopping problem. By using a novel neural network architecture, we develop two versions of data-driven approaches for this problem, one based on supervised learning, and the other based on reinforcement learning. Temporal difference learning can be applied and extends these two methods to many variants. Through numerical experiments on historical market data, we demonstrate significant cost reduction of these methods. Insights from numerical experiments reveals various tradeoffs in the use of temporal difference learning, including convergence rates, data efficiency, and a tradeoff between bias and variance.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call