Abstract

This study reviews the major developments of Deep Deterministic Policy Gradient (DDPG) in reinforcement learning. DDPG builds on ideas from the Deep Q-Network (DQN) and extends them to much more challenging problems that operate over continuous action spaces. Its main idea is to use an actor-critic architecture (shown in Figure 5) to learn more competitive policies, allowing the model to use neural network function approximators to learn in large state and action spaces. Thanks to this capacity, DDPG has useful applications to real-world problems in fields such as robotics and control systems. However, like most model-free reinforcement learning methods, DDPG still requires a large number of training steps, which remains its major practical difficulty.
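The actor-critic idea behind DDPG can be illustrated with a toy sketch: a deterministic actor mu(s) proposes an action, a critic Q(s, a) scores it, and the actor is improved by gradient ascent on Q(s, mu(s)). The linear networks, dimensions, and learning rate below are hypothetical simplifications for illustration, not the architecture used in the reviewed paper.

```python
import numpy as np

rng = np.random.default_rng(0)
state_dim, action_dim = 3, 1  # toy sizes (assumption, not from the paper)

# Linear actor: deterministic policy a = mu(s) = W_a @ s
W_a = rng.normal(scale=0.1, size=(action_dim, state_dim))
# Linear critic: Q(s, a) = w_s . s + w_a . a
w_s = rng.normal(scale=0.1, size=state_dim)
w_a = rng.normal(scale=0.1, size=action_dim)

def actor(s):
    return W_a @ s

def critic(s, a):
    return w_s @ s + w_a @ a

# Deterministic policy gradient for this linear case:
# grad_{W_a} Q(s, mu(s)) = outer(dQ/da, s), and here dQ/da = w_a.
s = rng.normal(size=state_dim)
lr = 0.1
q_before = critic(s, actor(s))
W_a += lr * np.outer(w_a, s)   # gradient ascent on Q(s, mu(s))
q_after = critic(s, actor(s))
```

In full DDPG the critic is itself trained from bootstrapped temporal-difference targets (with target networks and a replay buffer); this sketch fixes the critic and shows only the actor update direction.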
