Abstract

Reinforcement Learning (RL) has achieved significant success in application domains such as robotics, games, and health care. However, training RL agents is very time consuming. Current implementations exhibit poor performance due to challenges such as irregular memory accesses and thread-level synchronization overheads on CPUs. In this work, we propose a framework for generating scalable reinforcement learning implementations on multi-core systems. The Replay Buffer is a key component of RL algorithms that stores samples obtained from environmental interactions and provides data sampling for the learning process. We define a new data structure for the Prioritized Replay Buffer based on a $K$-ary sum tree that supports asynchronous parallel insertions, sampling, and priority updates. To address the challenge of irregular memory accesses, we propose a novel data layout for storing the nodes of the sum tree that reduces the number of cache misses. Additionally, we propose a lazy writing mechanism to reduce the thread-level synchronization overheads of Replay Buffer operations. Our framework employs parallel actors to concurrently collect data via environmental interactions, and parallel learners to perform stochastic gradient descent using the collected data. Our framework supports a wide range of reinforcement learning algorithms, including DQN, DDPG, etc. We demonstrate the effectiveness of our framework in accelerating RL algorithms through experiments on a CPU + GPU platform using OpenAI benchmarks. Our results show that our $K$-ary sum tree based Prioritized Replay Buffer improves upon baseline implementations by around 4x~100x. Our proposed synchronization optimizations improve performance by around 2x~4.4x compared with using a global lock. By plugging our Replay Buffer implementation into existing open-source reinforcement learning frameworks, we achieve 1.19x~1.75x speedup for various algorithms.
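As a rough illustration of the data structure the abstract refers to, the following single-threaded Python sketch implements a $K$-ary sum tree that supports insertion, priority updates, and proportional sampling. The class and method names (`KarySumTree`, `insert`, `update`, `sample`) and the implicit-array layout are illustrative assumptions, not the paper's API; the paper's asynchronous parallel operations, cache-friendly node layout, and lazy writing optimization are not reproduced here.

```python
import random


class KarySumTree:
    """Minimal array-based K-ary sum tree for prioritized replay (illustrative sketch)."""

    def __init__(self, capacity: int, k: int = 4):
        self.k = k
        self.capacity = capacity
        # Depth of the leaf level: smallest d with k**d >= capacity.
        depth = 0
        while k ** depth < capacity:
            depth += 1
        # Internal nodes occupy indices [0, leaf_start); leaves follow them.
        self.leaf_start = (k ** depth - 1) // (k - 1)
        self.tree = [0.0] * (self.leaf_start + k ** depth)
        self.data = [None] * capacity   # stored transitions
        self.next_slot = 0              # next insertion position (circular)
        self.size = 0

    def insert(self, item, priority: float) -> None:
        """Store a transition in the next slot and set its priority."""
        slot = self.next_slot
        self.data[slot] = item
        self.update(slot, priority)
        self.next_slot = (slot + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def update(self, slot: int, priority: float) -> None:
        """Set a leaf's priority and propagate the change up to the root."""
        idx = self.leaf_start + slot
        delta = priority - self.tree[idx]
        while True:
            self.tree[idx] += delta
            if idx == 0:
                break
            idx = (idx - 1) // self.k   # parent in the implicit K-ary layout

    def sample(self):
        """Sample a slot with probability proportional to its priority."""
        u = random.uniform(0.0, self.tree[0])   # tree[0] holds the total priority
        idx = 0
        while idx < self.leaf_start:            # descend until a leaf is reached
            child = idx * self.k + 1
            for c in range(child, child + self.k):
                if u < self.tree[c]:
                    idx = c
                    break
                u -= self.tree[c]
            else:
                idx = child + self.k - 1        # guard against float round-off
        slot = idx - self.leaf_start
        return slot, self.data[slot], self.tree[idx]


# Example usage: insert a transition, sample it, then lower its priority
# (e.g., after recomputing a TD error).
buf = KarySumTree(capacity=100_000, k=8)
buf.insert(("state", "action", "reward", "next_state"), priority=1.0)
slot, transition, p = buf.sample()
buf.update(slot, priority=0.5)
```

A larger branching factor $K$ makes the tree shallower, so each sampling or update traversal touches fewer levels; combined with a cache-conscious node layout, this is the kind of effect the abstract's reported cache-miss reduction relies on.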


