Abstract
This paper uses reinforcement learning (RL) to approximate the policy rules of banks participating in a high-value payment system (HVPS). The objective of the RL agents is to learn a policy function governing the amount of liquidity provided to the system at the beginning of the day and the rate at which intraday payments are made. Individual choices have complex strategic effects that preclude a closed-form solution of the optimal policy, except in simple cases. We show that in a stylized two-agent setting, RL agents learn the optimal policy that minimizes the cost of processing their individual payments, even without complete knowledge of the environment. We further demonstrate that in more complex settings, both agents learn to reduce the cost of processing their payments and respond effectively to the liquidity-delay trade-off. Our results show the potential of RL to solve liquidity management problems in HVPSs and provide new tools to assist policymakers in their mandates of ensuring the safety and improving the efficiency of payment systems.